feat(skills): add tts-voiceover skill for Azure Speech SDK voice-over generation#1415
feat(skills): add tts-voiceover skill for Azure Speech SDK voice-over generation#1415
Conversation
Dependency ReviewThe following issues were found:
License Issues.github/skills/experimental/tts-voiceover/uv.lock
plugins/hve-core-all/skills/experimental/tts-voiceover/uv.lock
OpenSSF ScorecardScorecard details
Scanned Files
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1415 +/- ##
==========================================
- Coverage 87.44% 82.87% -4.57%
==========================================
Files 68 56 -12
Lines 10335 7469 -2866
==========================================
- Hits 9037 6190 -2847
+ Misses 1298 1279 -19
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Dependency Review — tts-voiceover skill
Summary
This PR adds a new tts-voiceover skill with 4 runtime dependencies and a uv.lock file pinning all transitive dependencies.
| Package | Version (lock) | License | Classification |
|---|---|---|---|
azure-cognitiveservices-speech |
1.49.0 | MIT | New to skills ecosystem |
azure-identity |
(via azure-core) | MIT | New to skills ecosystem |
python-pptx |
(pinned in lock) | MIT | Already used in powerpoint skill |
pyyaml |
(pinned in lock) | MIT | Already used in powerpoint skill |
Dev/fuzz dependencies (pytest, ruff, atheris) follow the established repo pattern and are consistent with other skills.
1. New Dependencies
azure-cognitiveservices-speech v1.49.0 — Microsoft's Azure Cognitive Services Speech SDK. MIT licensed, actively maintained by Microsoft, uploaded 2026-04-07. Necessary for the core TTS functionality; no equivalent exists in the project. ✅
azure-identity — Microsoft's Azure Identity library for Entra ID token auth. MIT licensed, well-maintained. Necessary for the SPEECH_RESOURCE_ID auth path; no duplicate in the project. ✅
python-pptx and pyyaml — Both already present in the sibling powerpoint skill. Licenses confirmed MIT. Each skill carries its own pyproject.toml/lock file per repo convention, so this duplication is expected and correct. ✅
2. Version Specifiers
Runtime dependencies in pyproject.toml are unpinned (no minimum version constraints). This matches the pattern used in the powerpoint skill and other experimental skills in this repo. The uv.lock file provides actual reproducible pinning at install time, which is the uv-recommended approach. ✅
3. SHA Pinning Compliance
No GitHub Actions workflow files were modified. N/A. ✅
4. Devcontainer and Setup Alignment
No changes to .devcontainer/ or copilot-setup-steps.yml. N/A. ✅
Observations
- The dev group omits
pytest-mockandpytest-covthat other skills include. This is not blocking, but worth noting if test coverage tooling is expected. azure-cognitiveservices-speechdeclaresazure-coreas a transitive dependency, which in turn pulls inrequests,certifi, andcharset-normalizer. All are permissively licensed (Apache-2.0 or MIT) and compatible with this project's MIT license.
All safety checks passed. No blocking issues found.
Generated by Dependabot PR Review for issue #1415 · ● 1.6M
There was a problem hiding this comment.
Automated Quality Review — feat(skills): add tts-voiceover skill
This PR introduces a well-structured Python skill for Azure TTS voice-over generation. The code quality is generally high — copyright headers, exit codes, argparse/pathlib/logging patterns, type hints, and the fuzz harness all follow repo conventions. There are, however, three blocking issues that must be resolved before merge.
🔗 Issue Alignment
❌ No linked issue.
The "Related Issue(s)" section contains only None. All PRs must reference a linked issue using Fixes #, Closes #, or Resolves #. Please open an issue describing the motivation for this skill and link it here.
📋 PR Template Compliance
❌ Required automated checks not completed.
The following items in the Required Automated Checks section remain unchecked with no indication they were run or are N/A:
| Check | Status |
|---|---|
npm run spell-check |
❌ unchecked |
npm run lint:md-links |
❌ unchecked |
npm run docs:test |
❌ unchecked (may be N/A for skill-only changes — please mark as N/A if so) |
npm run lint:ps is N/A here (no PowerShell files changed) and can be annotated accordingly.
❌ AI Artifact Contributions checklist items unchecked.
Used /prompt-analyze to review contribution and Addressed all feedback from prompt-builder review are both unchecked. These require manual verification by the author.
🛠️ Coding Standards
Python scripts comply with repo conventions (python-script.instructions.md, python-test.instructions.md): copyright headers ✅, exit codes ✅, create_parser() ✅, pathlib ✅, structured logging ✅, if __name__ == "__main__" guard ✅, Google-style docstrings ✅. The fuzz harness follows the polyglot pattern with pytest fallback ✅. The pyproject.toml contains all required sections ([tool.ruff], [tool.ruff.lint], [tool.pytest.ini_options], fuzz group) ✅.
🔍 Code Quality
❌ [Blocking] SSML XML injection via unescaped speaker notes — inline comment on generate_voiceover.py line 90.
Speaker notes placed into the SSML document are not XML-escaped. Any note containing &, <, or > produces malformed XML that Azure TTS will reject. Fix by calling xml.sax.saxutils.escape(notes) before apply_acronym_aliases. See inline comment for the one-line fix.
❌ [Blocking] Non-existent file referenced in hve-core-all.collection.yml — inline comment on line 506.
.github/instructions/security-planning/mcp-security.instructions.md does not exist in the repository and was not added by this PR. This is unrelated scope creep that will cause npm run lint:collections-metadata to fail. Remove the two added lines or submit them in a separate PR along with the actual file.
generate_voiceover.py line 214.
The token is fetched once and reused for all slides. Tokens expire after ~1 hour, which may cause mid-run 401 failures on large decks. A refresh-before-expiry pattern is suggested in the inline comment.
i️ [Informational] Partial substring matching in apply_acronym_aliases.
str.replace() can match an acronym inside a longer word (e.g., AST inside FASTEST). The longest-first sort mitigates prefix overlaps but not embedded occurrences. Consider using \b-bounded regex for word-boundary matching when this matters for user-supplied content.
✅ Action Items
- Open a tracking issue and add a
Fixes #reference to the PR description. - Remove the
mcp-security.instructions.mdentry fromcollections/hve-core-all.collection.yml. - XML-escape
notesbefore passing toapply_acronym_aliases(one-line fix at line 247). - Run and check
npm run spell-checkandnpm run lint:md-links; update the checklist. - Address the unchecked AI Artifact Contributions items or explain why they are N/A.
516d4af to
03de7ed
Compare
There was a problem hiding this comment.
PR Review
This PR adds a well-conceived tts-voiceover skill with solid overall structure — good SSML handling, dual auth support, a proper fuzz harness, and thorough documentation. Three issues need to be resolved before merging.
Issue Alignment
The linked issue (#1417) could not be read due to access restrictions. Based on the PR description alone, the core tts-voiceover skill additions appear well-scoped and purposeful. However, the addition of mcp-security.instructions.md to hve-core-all.collection.yml (see inline comment) introduces unrelated scope.
PR Template Compliance
Two checkbox gaps:
-
Script/automation (.ps1, .sh, .py)not checked — This PR adds two Python scripts (generate_voiceover.py,embed_audio.py). The "Script/automation" checkbox under Other in the Type of Change section should be checked. -
New featurenot checked — Adding a new skill is a non-breaking new feature. The "New feature" checkbox under Code & Documentation should also be checked. -
Skill script requirement — The PR template states: "Skills: Must include both bash and PowerShell scripts." This skill ships Python-only scripts with no
generate_voiceover.sh/generate_voiceover.ps1wrappers. Cross-platform wrapper scripts (e.g., setting up theuvenvironment and invoking the Python entrypoints) are required to meet the stated skill contribution standard. Seedocs/contributing/skills.md.
Coding Standards
See inline comments for specifics. Summary:
⚠️ wrap_ssmlXML injection (generate_voiceover.pylines 88–89):voiceandrateCLI arguments are inserted into XML attribute positions without escaping. The PR correctly escapes speaker notes but misses these two attributes. Usexml.sax.saxutils.quoteattr()as shown in the inline comment.- ❌ Unrelated/contradictory collection change (
hve-core-all.collection.ymlline 506): PR description claimsmcp-security.instructions.mdwas removed, but the diff adds it. This entry is also missing amaturityfield. Clarify and correct. - 💡 Unpinned Azure SDK versions (
pyproject.tomllines 6–9): Suggestion to add minimum version bounds forazure-cognitiveservices-speechandazure-identity(non-blocking, but recommended).
Code Quality
Beyond the issues above, the implementation is clean:
- Correct XML escaping of speaker notes before SSML substitution ✅
- Entra ID token refresh logic 5 min before expiry ✅
longest-firstacronym sort prevents partial-match shadowing ✅EXIT_SUCCESS/EXIT_FAILURE/EXIT_ERRORexit code constants ✅- Proper
from __future__ import annotationsand modern type hints ✅ - Fuzz harness with Atheris/pytest polyglot mode ✅
Required Actions
- Escape
voiceandrateinwrap_ssmlusingxml.sax.saxutils.quoteattr(). - Resolve the
mcp-security.instructions.mddiscrepancy inhve-core-all.collection.yml(remove if accidental; addmaturityfield and correct PR description if intentional). - Add bash and PowerShell wrapper scripts for the two Python entrypoints to satisfy the skill contribution requirement, or document why Python-only is acceptable for this skill.
- Check the
Script/automationandNew featurecheckboxes in the Type of Change section.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | noneThere was a problem hiding this comment.
Automated PR Review — feat(skills): add tts-voiceover skill
Verdict: REQUEST_CHANGES — one security finding must be resolved before merge; three additional quality improvements are strongly recommended.
Issue Alignment
Linked issue Fixes #1417 is present. The diff is consistent with a new TTS skill addition. No obvious scope creep detected.
PR Template Compliance
| Item | Status |
|---|---|
| Description filled in | ✅ |
| Related Issue(s) | ✅ Fixes #1417 |
| Type of Change — at least one checked | ✅ "Copilot skill" |
| Testing section | ✅ described |
| Required checklist |
npm run lint:md-links is marked "deferred to CI" and left unchecked. Per the PR template instructions, this must be verified and checked before merge.
.py, .ps1, .sh) but the "Script or automation" Type of Change checkbox is not selected. Please tick it.
mcp-security.instructions.md was removed from hve-core-all.collection.yml. The actual diff shows no such removal in that file. The plugins/hve-core-all/README.md change adds mcp-security.instructions to the table. Please remove or correct this note.
Coding Standards
Applicable instruction files checked:
python-script.instructions.md— Python conventions ✅ (modulo findings below)powershell.instructions.md— PowerShell conventions⚠️ (duplication finding)bash.instructions.md— bash scripts ✅ cleanmarkdown.instructions.md— SKILL.md ✅ clean; frontmatter validprompt-builder.instructions.md— SKILL.md skill structure ✅ correct
Code Quality & Security
🔒 [BLOCKING] SSML attribute injection via alias values (generate_voiceover.py:76)
xml.sax.saxutils.escape(alias) does not encode ". If an acronyms.yaml alias value contains a double-quote, it escapes the alias="..." attribute in the generated SSML, enabling attribute injection or document structure corruption. Fix: xml.sax.saxutils.escape(alias, {'"': """}) or html.escape(alias, quote=True). See inline comment.
generate_voiceover.py:205)
speech_key, speech_region, and speech_resource_id are assigned only inside if not args.dry_run: but are referenced later in the loop body. The current guard prevents a NameError at runtime, but the code is fragile. Initialize to None before the conditional. See inline comment.
💡 Shared PowerShell helpers duplicated across two scripts (Invoke-EmbedAudio.ps1:56, Invoke-GenerateVoiceover.ps1:77)
Test-UvAvailability, Initialize-PythonEnvironment, and Get-VenvPythonPath are copy-pasted verbatim. Extract to scripts/Modules/TtsVoiceoverHelpers.psm1. See inline comment.
💡 Overly broad object type hint (embed_audio.py:39)
slide: object should be slide: pptx.slide.Slide for IDE support and type-checker correctness. See inline comment.
Action Items
- [Required] Fix SSML alias attribute encoding — encode
"as"inapply_acronym_aliases. - [Required] Check
npm run lint:md-linksand mark the checklist item. - [Required] Correct or remove the "Additional Notes" claim about
mcp-security.instructions.mdremoval. - [Required] Check "Script or automation" in the Type of Change section.
- [Recommended] Initialize
speech_key,speech_region,speech_resource_id = Nonebefore the dry-run block. - [Recommended] Extract shared PowerShell helpers to a
Modules/TtsVoiceoverHelpers.psm1. - [Recommended] Narrow
slide: objecttopptx.slide.Slide.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | noneThere was a problem hiding this comment.
PR Review — feat(skills): add tts-voiceover skill
This PR adds a well-structured new skill with good cross-platform coverage (Python + Bash + PowerShell wrappers), a working fuzz harness, and solid documentation. The token-refresh logic and acronym escaping in apply_acronym_aliases are thoughtful. Two issues require changes before merge, and a few smaller items are noted below.
🔒 Issue Alignment
Linked issue #1417 could not be independently verified (integrity policy), but the PR description is coherent, the implementation matches the stated goals, and no obvious scope creep is present.
⚠️ PR Template Compliance
Required fix — "Script/automation" checkbox not checked:
The PR adds generate_voiceover.py, embed_audio.py, generate-voiceover.sh, embed-audio.sh, Invoke-GenerateVoiceover.ps1, and Invoke-EmbedAudio.ps1. The "Script/automation (.ps1, .sh, .py)" checkbox under Other must be checked to accurately reflect the type of change.
Minor — First AI Artifacts checkbox absent from Type of Change section:
The template's first AI Artifacts checkbox ("Reviewed contribution with prompt-builder agent and addressed all feedback") was dropped entirely rather than left unchecked with an N/A note. The Checklist section does explicitly N/A it, which is appreciated, but the Type of Change section should retain the unchecked item for completeness.
🔒 Security (blocking)
Incomplete SSML injection fix — see inline comment on generate_voiceover.py lines 89–90. The PR correctly escapes speaker_notes via xml.sax.saxutils.escape before SSML construction, but the voice and rate CLI arguments are interpolated into XML attributes verbatim. A value containing " would break the XML document structure. Escaping these two parameters completes the fix the PR description claims.
💡 Code Quality (non-blocking suggestions)
speechsdk potentially unbound in static analysis:
speechsdk is assigned inside if not args.dry_run: and later referenced in the token-refresh block inside the slide loop. At runtime this is safe because the refresh code is unreachable during a dry run. However, static analysis tools (mypy, Pyright) will flag the name as potentially unbound after a refactor. A forward declaration (speechsdk = None) before the conditional block, or moving the import to module level with a try/except ImportError, would make the intent explicit.
_make_entra_config return type is tuple not tuple[SpeechConfig, float]:
The bare tuple return annotation drops type precision. Per the python-script conventions, typed return annotations are preferred. Since speechsdk is a conditional import, a TYPE_CHECKING-guarded annotation or a comment-style annotation is acceptable.
generate_audio uses object for speech_config:
Same concern — object accepts any value and loses static type safety. A forward reference or a TYPE_CHECKING import of azure.cognitiveservices.speech.SpeechConfig would improve IDE support.
📋 Collection / Conventions (non-blocking)
maturity field missing — see inline comment on collections/experimental.collection.yml. New experimental-collection skills should carry an explicit maturity: experimental (or preview) rather than silently defaulting to stable. The same applies to the matching entry in hve-core-all.collection.yml.
✅ What looks good
- Cross-platform wrappers (Bash + PowerShell) following repo conventions; copyright headers,
set -euo pipefail,main()pattern, and comment-based help are all present. pyproject.tomlcorrectly includes[tool.ruff],[tool.pytest.ini_options],fuzzdependency group withatheris>=3.0, andpython_files = ["test_*.py", "fuzz_harness.py"].- Fuzz harness is a proper polyglot (pytest + Atheris) covering
apply_acronym_aliases,wrap_ssml, andload_acronyms. - Token refresh logic (5-minute pre-expiry window) for Entra ID long-running decks is a nice reliability touch.
- Acronyms sorted longest-first to prevent partial substring replacement — correct and well-considered.
✅ Required Actions
- Complete the SSML escaping for
voiceandrateinwrap_ssml(see inline comment). - Check the "Script/automation" checkbox in the Type of Change section of the PR description.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | noneThere was a problem hiding this comment.
PR Review: feat(skills): add tts-voiceover skill
Overall the contribution is well-structured and shows careful attention to XML injection for speaker notes, cross-platform parity, and fuzz testing. However, there are a few required fixes before this can merge.
Issue Alignment
Linked issue #1417 could not be read due to repository access constraints. Based on the PR description, the implementation aligns with the stated goal of generating per-slide WAV voice-over files from YAML speaker notes using Azure Speech SDK.
No scope creep detected. The removal of the unrelated mcp-security.instructions.md reference from hve-core-all.collection.yml is a valid cleanup.
PR Template Compliance
.ps1, .sh, and .py files but only the Copilot skill box is checked. The Script/automation (.ps1, .sh, .py) checkbox under Other must also be checked to accurately reflect the change type.
Coding Standards Findings
Python (python-script.instructions.md)
generate_voiceover.pyandembed_audio.py— Missing interrupt/pipe error handling (inline comments on both files):main()must catchKeyboardInterrupt(return 130) andBrokenPipeError(close stderr, return 1) at the top level per the instructions. The fix requires extracting the body into arun()helper.
PowerShell (powershell.instructions.md)
Invoke-GenerateVoiceover.ps1andInvoke-EmbedAudio.ps1— Missing main execution guard (inline comments on both files): The instructions requireif ($MyInvocation.InvocationName -ne '.')wrapping the main execution block to enable dot-sourcing by Pester tests.
Security Findings
generate_voiceover.pywrap_ssml()— Incomplete SSML XML injection fix (inline comment): The PR description notes fixing SSML XML injection for speaker notes, andxml.sax.saxutils.escape()is correctly applied there. However,voiceandrateCLI parameters are embedded directly into XML attributes without escaping. Usexml.sax.saxutils.quoteattr()on both for consistency with the existing fix.
Action Items
| # | File | Issue | Severity |
|---|---|---|---|
| 1 | generate_voiceover.py |
Incomplete XML injection fix — escape voice and rate |
|
| 2 | generate_voiceover.py |
Missing KeyboardInterrupt/BrokenPipeError in main() |
|
| 3 | embed_audio.py |
Missing KeyboardInterrupt/BrokenPipeError in main() |
|
| 4 | Invoke-GenerateVoiceover.ps1 |
Missing main execution guard | |
| 5 | Invoke-EmbedAudio.ps1 |
Missing main execution guard | |
| 6 | PR description | Check Script/automation Type of Change checkbox |
📋 Template |
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | noneThere was a problem hiding this comment.
PR Review: feat(skills): add tts-voiceover skill
Overview
The new tts-voiceover skill is a well-structured addition — the Python scripts are organized clearly, the SKILL.md documentation is thorough, cross-platform wrappers are provided, and the fuzz harness is present. However, the review identified one security issue and several correctness / coding-standard violations that need to be addressed before merge.
🔒 Issue Alignment
Linked issue #1417 could not be fetched due to a repository access constraint. Review is based on the PR description, which presents a coherent narrative (generate TTS audio from slides, embed audio into PPTX). No obvious scope creep observed. If the issue contains acceptance criteria not reflected here, please verify them against the changes.
📋 PR Template Compliance
| Check | Status |
|---|---|
| Description filled in | ✅ |
| Related Issue(s) linked | ✅ (Fixes #1417) |
| Type of Change checked | .py, .ps1, and .sh files |
| Testing section filled in | ✅ |
| Required checklist items checked | ✅ |
| AI Artifact Contributions checklist | ✅ |
Action required: Check the Script or automation checkbox in the Type of Change section.
🔒 Security
SSML injection via unescaped --voice and --rate CLI arguments (generate_voiceover.py lines 88–89) — see inline comment.
voice and rate are CLI-supplied strings interpolated directly into XML attributes inside wrap_ssml(). Characters such as ", <, >, and & will produce malformed SSML. A value like en-US-Test&Debug generates invalid XML, and a crafted rate value could inject unexpected XML attributes. Both values must be passed through xml.sax.saxutils.escape(value, {'"': """}) before interpolation. Additionally, alias values in apply_acronym_aliases() (line 79) are substituted into attribute values and element text without escaping — the same xml.sax.saxutils.escape treatment is needed there.
🐛 Correctness
-
apply_acronym_aliasescan corrupt emitted SSML tags (generate_voiceover.pyline 79) — see inline comment. The iterativestr.replaceloop can match inside XML markup that was inserted by earlier iterations. Using a single-passre.subover all acronyms at once eliminates the problem. -
embed_audio.pysilently succeeds on per-slide failures (embed_audio.pylines 120–127) — see inline comment. Theelsebranch logs the error butmain()always returnsEXIT_SUCCESS. Add afailed_countvariable and returnEXIT_FAILUREwhen any slide fails.
📋 Coding Standards
-
Missing
KeyboardInterrupt/BrokenPipeErrorhandling inmain()for bothgenerate_voiceover.py(line 185) andembed_audio.py(line 86) — see inline comment. Required bypython-script.instructions.md. -
Missing PowerShell main execution guard in both
Invoke-EmbedAudio.ps1(line 56) andInvoke-GenerateVoiceover.ps1(line 77) — see inline comments. Required bypowershell.instructions.mdto support dot-sourcing for Pester tests. -
Missing
maturity: experimentalinexperimental.collection.yml(line 20) andhve-core-all.collection.yml(line 548) — see inline comments. Omitting the field causes both collection manifests to implicitly mark this experimental skill asstable. After adding the field, re-runnpm run plugin:generate.
i️ Minor Observations (non-blocking)
TtsVoiceoverHelpers.psm1:Export-ModuleMemberuses comma-separated function names rather than the@('F1', 'F2', 'F3')array form preferred bypowershell.instructions.md. Low priority.plugins/hve-core-all/README.mdadds amcp-security.instructionsrow. The PR description says this entry was removed fromhve-core-all.collection.yml; the plugin README shows it being added. This looks like the collection YAML onmainalready contained the entry but the plugin outputs hadn't been regenerated until this PR. Not a blocking issue, but worth confirming intent.
✅ Action Items
- Escape
voice,rate, and all SSML-interpolated user-controlled strings withxml.sax.saxutils.escape. - Replace the iterative
text.replaceloop inapply_acronym_aliaseswith a single-passre.sub. - Track
failed_countinembed_audio.pyand returnEXIT_FAILUREwhen any slide fails. - Add
KeyboardInterrupt/BrokenPipeErrorhandling tomain()in both Python scripts. - Wrap the main execution blocks in
Invoke-EmbedAudio.ps1andInvoke-GenerateVoiceover.ps1withif ($MyInvocation.InvocationName -ne '.') { ... }. - Add
maturity: experimentalto both collection YAML entries and re-runnpm run plugin:generate. - Check the Script or automation checkbox in the PR description's Type of Change section.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | noneThere was a problem hiding this comment.
Review Summary
This PR adds a well-structured tts-voiceover skill with solid documentation, cross-platform wrappers, and a proper fuzz harness. The implementation covers Entra ID token refresh, SSML dry-run mode, and the content directory integration with the PowerPoint skill. Two security issues must be resolved before merge; the remaining findings are convention violations from the repository's coding standards.
🔒 Issue Alignment
Fixes #1417 — the PR implements the TTS voice-over generation feature described. The stated goals (WAV-per-slide, SSML acronym control, Entra ID auth, dry-run mode) are all present. No scope creep or missing requirements observed.
⚠️ PR Template Compliance
- Missing "Script/automation" checkbox — The PR adds
.py,.sh, and.ps1files but only the "Copilot skill" checkbox was checked under Type of Change. TheScript/automation (.ps1, .sh, .py)checkbox should also be checked per the template.
🔒 Coding Standards — Security
| # | File | Issue | Severity |
|---|---|---|---|
| 1 | scripts/generate_voiceover.py:88–89 |
voice and rate attribute values are not XML-escaped in wrap_ssml; the PR note says "Fixed SSML XML injection" but the fix only covered the text body |
Medium |
| 2 | pyproject.toml:5–10 |
All four production dependencies have no version lower bounds — supply chain risk if a breaking or malicious release ships | Medium |
💡 Coding Standards — Conventions
| # | File | Issue |
|---|---|---|
| 3 | Invoke-EmbedAudio.ps1:70 |
$args_ violates PascalCase; use $PythonArgs |
| 4 | Invoke-GenerateVoiceover.ps1:91 |
Same $args_ naming issue |
| 5 | Invoke-EmbedAudio.ps1, Invoke-GenerateVoiceover.ps1 |
Missing main execution invocation guard (if ($MyInvocation.InvocationName -ne '.')) and missing #endregion Main labels |
| 6 | TtsVoiceoverHelpers.psm1:1–3 |
Module-level # ModuleName.psm1 / # Purpose: ... comments missing after copyright header |
| 7 | embed_audio.py:85, generate_voiceover.py:~184 |
main() lacks KeyboardInterrupt (exit 130) and BrokenPipeError handling |
💡 Minor Observations (non-blocking)
hve-core-all.collection.ymladds the skill withoutmaturity: experimental. Per project conventions, omittingmaturitydefaults tostable. Given this is an experimental skill, consider addingmaturity: experimentalfor accurate classification — consistent with other entries in the file that do declare their maturity.
✅ Strengths
- Fuzz harness correctly implements the polyglot pytest/Atheris pattern with corpus files and proper
FUZZINGguard. pyproject.tomlfollows all required sections ([tool.ruff],[tool.pytest.ini_options],fuzzgroup,python_files).xml.sax.saxutils.escapeis applied to speaker notes text before SSML wrapping — good; just needs to be extended to the attribute values.- Cross-platform wrappers (bash + PowerShell) are both present with matching feature parity.
- Entra ID token refresh logic (5-minute pre-expiry window) is correctly implemented.
🎯 Required Actions
- Fix XML attribute injection in
wrap_ssml— usexml.sax.saxutils.quoteattr()forvoiceandrate. - Add version lower bounds to all four production dependencies in
pyproject.toml. - Check "Script/automation" in the PR Type of Change section.
- Address the convention findings (PowerShell naming, invocation guard, module comments, Python error handling) as they are straightforward mechanical fixes.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | noneThere was a problem hiding this comment.
PR Review: feat(skills): add tts-voiceover skill
The implementation is well-structured overall — the Python scripts follow most conventions, the cross-platform wrapper pattern is solid, the fuzz harness is thorough, and the SSML-escaping of speaker notes is a good security call. However, several issues require resolution before merge.
🔗 Issue Alignment
The PR links Fixes #1417. The description and implementation are internally consistent and the scope appears appropriately bounded to a new skill. No scope creep detected.
📋 PR Template Compliance
The PR adds .ps1, .sh, and .py scripts (7 files). The "Script/automation (.ps1, .sh, .py)" checkbox under the Other section of Type of Change must also be checked. The template says "Select all that apply."
All other template sections (description, related issue, sample prompts, testing, checklist) are filled correctly.
🧑💻 Coding Standards
Python (python-script.instructions.md)
| Finding | File | Severity |
|---|---|---|
voice and rate inserted into SSML XML without escaping |
generate_voiceover.py L88–89 |
|
main() missing KeyboardInterrupt / BrokenPipeError handler |
embed_audio.py L77, generate_voiceover.py L161 |
💡 Low |
PowerShell (powershell.instructions.md)
| Finding | File | Severity |
|---|---|---|
Module missing # TtsVoiceoverHelpers.psm1 / # Purpose: header comment; Export-ModuleMember uses comma list not array |
TtsVoiceoverHelpers.psm1 L77 |
💡 Low |
Scripts missing main execution guard (if ($MyInvocation.InvocationName -ne '.')) |
Invoke-EmbedAudio.ps1 L80, Invoke-GenerateVoiceover.ps1 |
💡 Low |
Skill file (prompt-builder.instructions.md)
| Finding | File | Severity |
|---|---|---|
Copilot attribution line appended after the required > Brought to you by footer; the footer must be the last content line |
SKILL.md L195 |
💡 Low |
🔒 Code Quality and Security
pytest-cov>=7.0 — unresolvable version constraint (pyproject.toml L17)
pytest-cov has not reached version 7.x. This constraint will fail uv sync for anyone setting up the development environment. Use >=5.0 or >=6.0.
SSML injection (generate_voiceover.py L88–89)
The voice and rate argument values are interpolated directly into the XML template. The PR already correctly XML-escapes speaker notes; the same xml.sax.saxutils.escape treatment must be applied to voice and rate. See the inline comment for the suggested fix.
maturity: experimental missing (hve-core-all.collection.yml L548)
Without the explicit field, the collection system defaults maturity to stable, which misclassifies a skill in the experimental track. See the inline comment.
✅ No Issues Found In
- Bash script conventions (shebang, copyright,
set -euo pipefail,err(),usage(), quoting) pyproject.tomlstructure (ruff, pytest, fuzz group,python_files)- Fuzz harness design and pytest mode compatibility
apply_acronym_aliaseslongest-match ordering- Entra ID token refresh logic (5-minute pre-expiry window)
embed_audio.pyfunctional correctness- Collection YAML additions and plugin regeneration
- cspell dictionary update
📌 Required Actions Before Merge
- Check the Script/automation checkbox in the PR description.
- XML-escape
voiceandrateinwrap_ssml(see inline comment ongenerate_voiceover.pyL88–89). - Fix
pytest-cov>=7.0→>=5.0or>=6.0inpyproject.toml. - Add
maturity: experimentalto thetts-voiceoverentry inhve-core-all.collection.yml. - Remove the extra Copilot attribution line from
SKILL.md(footer must be the last line). - Add the purpose comment header to
TtsVoiceoverHelpers.psm1; switchExport-ModuleMemberto array syntax. - Wrap main execution in the invocation guard in both
Invoke-EmbedAudio.ps1andInvoke-GenerateVoiceover.ps1. - Add
KeyboardInterrupt/BrokenPipeErrorhandling tomain()in both Python scripts.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | noneThere was a problem hiding this comment.
Automated Quality Review — feat(skills): add tts-voiceover skill
The skill is well-structured and the overall implementation is clean. However, several blocking issues — security bugs, a correctness bug, and convention violations — need to be addressed before merge. The open review threads below summarise the required changes; each item links back to the thread where inline detail is captured.
🔒 Security
[Open thread] SSML attribute injection — voice and rate not XML-escaped in wrap_ssml
generate_voiceover.py lines 75–85: --voice and --rate CLI values are interpolated directly into XML attribute positions without xml.sax.saxutils.escape(). The PR description explicitly states "Fixed SSML XML injection", but the fix is incomplete — only speaker_notes were escaped. A value such as --voice 'foo"bar' produces malformed SSML and could be leveraged if this function is ever called with data from less-trusted sources.
[Open thread] SSML alias double-quoting — alias attribute not double-quote–escaped
apply_acronym_aliases escapes &, <, > via xml.sax.saxutils.escape() but the {'"': """} mapping needs to be passed consistently and verified against a lexicon entry that itself contains a double-quote character.
🐛 Correctness Bugs
[Open thread] text.replace() corrupts already-inserted SSML tags
After the first acronym replacement, the output string contains raw SSML markup (<sub alias="...">...</sub>). Subsequent text.replace() calls operate on this augmented string and can partially re-substitute text inside generated alias attribute values, producing malformed or nested SSML. A safe approach is to replace on the original text and track offsets, or use regex \b word-boundary matching against the pre-substituted string.
[Open thread] embed_audio.py::main() always returns EXIT_SUCCESS on embedding failure
When embed_slide_audio() returns False the error is logged but the function still returns EXIT_SUCCESS. Callers (including the bash/PowerShell wrappers) cannot detect partial failures; partially-narrated decks are silently saved.
📦 Collection / Metadata
[Open thread] Missing maturity: experimental on both collection entries
Both collections/experimental.collection.yml and collections/hve-core-all.collection.yml omit the maturity field. Per collection conventions, omitting it defaults to stable, which misrepresents a skill under experimental/. Both entries need maturity: experimental.
[New finding — inline comment] Unused lxml production dependency
pyproject.toml line 8 declares lxml>=6.1.0 but neither Python script imports it. This should be removed (or moved to a dev group if kept for future use). See the inline comment for details.
🔧 Convention Violations
[Open thread] Missing KeyboardInterrupt / BrokenPipeError handling — both Python scripts
Per python-script.instructions.md, main() must catch KeyboardInterrupt (exit 130) and BrokenPipeError at the top level. Neither generate_voiceover.py nor embed_audio.py does this.
[Open thread] Missing main execution guard — both PowerShell wrappers
Invoke-GenerateVoiceover.ps1 and Invoke-EmbedAudio.ps1 are missing the if ($MyInvocation.InvocationName -ne '.') { ... } guard required by powershell.instructions.md. Without it the scripts cannot be safely dot-sourced by Pester tests.
[Open thread] Conditionally assigned variables used outside their conditional block
speech_key, speech_region, and speech_resource_id in generate_voiceover.py are assigned only inside if not args.dry_run: but are referenced (for the Entra ID refresh check) in the loop body that runs regardless of execution path. Static analysers will flag these as potentially-undefined.
[Open thread] Unversioned production dependencies
All four runtime deps in pyproject.toml specify only a lower-bound (e.g. >=1.41). Combined with a uv.lock that is checked in, this means a regenerated lock file could pull breaking versions. Follow the existing skill pattern of constraining to a compatible range (e.g. >=1.41,<2).
📋 PR Template
[Open thread] "Script or automation" checkbox unchecked
The PR adds .py, .ps1, and .sh files matching the .*\.(ps1|sh|py)$ change-type pattern. The "Script/automation" checkbox under "Other" in the Type of Change section should be checked.
✅ Passing
- Frontmatter schema valid;
nameanddescriptionpresent. - Copyright headers (
# Copyright (c) Microsoft Corporation.+ SPDX) present on all scripts. set -euo pipefailin both bash wrappers.- Parallel bash/PowerShell wrappers provided as required by skill conventions.
- Fuzz harness covers
apply_acronym_aliases,wrap_ssml, andload_acronymswith property-based tests. pyproject.tomlincludes required[tool.ruff],[tool.ruff.lint],[tool.pytest.ini_options], andpython_files = ["test_*.py", "fuzz_harness.py"]sections.- Collection YAML and plugin outputs updated.
Action items (must address before merge):
- Escape
voiceandratewithxml.sax.saxutils.escape()inwrap_ssml. - Fix
apply_acronym_aliasesto avoid re-substituting already-generated SSML markup. - Return a non-zero exit code from
embed_audio.py::main()when any embedding fails. - Add
maturity: experimentalto both collection entries. - Remove unused
lxmlfrompyproject.tomlproduction dependencies. - Add
KeyboardInterrupt/BrokenPipeErrorhandling to both Pythonmain()functions. - Wrap
#region Maininif ($MyInvocation.InvocationName -ne '.')in both PS1 wrappers. - Narrow dependency version upper-bounds or document why open ranges are intentional.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | noneThere was a problem hiding this comment.
PR Review — feat(skills): add tts-voiceover skill
The implementation is thoughtful and largely well-structured — the SSML injection fix, Entra ID token refresh logic, polyglot fuzz harness, and cross-platform wrappers all reflect careful engineering. However there are several required fixes before this can merge.
Issue Alignment
Linked issue #1417 — the changes in this PR appear aligned with adding a TTS voice-over skill using Azure Speech SDK. One concern: the hve-core-all.collection.yml changes promote ~10 unrelated artifacts from experimental to stable maturity, which goes well beyond the stated scope of this issue. See the inline comment on that file.
PR Template Compliance ⚠️
The Type of Change section is incomplete. The following boxes must be checked to match what was actually changed:
| Missing checkbox | Evidence |
|---|---|
GitHub Actions workflow |
.github/workflows/dependency-review.yml was modified (added certifi and charset-normalizer to the allow-list) |
Script/automation (.ps1, .sh, .py) |
2 PowerShell scripts, 2 bash scripts, and 2 Python scripts were added |
The current description only checks "Copilot skill," which is incomplete given the full scope of changes.
Coding Standards ⚠️
PowerShell scripts (Invoke-EmbedAudio.ps1, Invoke-GenerateVoiceover.ps1):
- Body of the
if ($MyInvocation.InvocationName -ne '.')invocation guard is not indented — violates the standard structure from powershell-script.instructions.md. - Missing
try/catchblock with explicitexit 0/exit 1codes in the main execution guard.
See the inline comment onInvoke-EmbedAudio.ps1line 60 for a corrected template (applies equally toInvoke-GenerateVoiceover.ps1).
SKILL.md — non-standard attribution footer (see inline comment on line 195).
Code Quality ✅
The Python code is clean:
- SSML injection is correctly mitigated: speaker notes are XML-escaped before acronym substitution (
xml.sax.saxutils.escape), andquoteattris used for voice/rate attributes. - The polyglot fuzz harness correctly guards atheris-dependent code behind
if __name__ == "__main__" and FUZZING:so pytest invocations stay clean. - Token refresh logic for Entra ID auth is sound — the
speech_resource_id is Noneshort-circuit preventsNameErroronspeechsdkin dry-run mode. - Bash wrappers follow the conventions (strict mode, proper quoting,
main "$@"pattern, copyright headers).
The dependency-review.yml change is justified — certifi (MPL-2.0) and charset-normalizer (LGPL for detection code) are transitive dependencies of azure-* packages and the comments explain the rationale.
Scope Concerns ⚠️
Two changes in this PR are outside the stated scope and should either be documented or moved to separate PRs:
hve-core-all.collection.yml: ~10 unrelated artifacts are promoted fromexperimentalto stable. See inline comment.experimental.collection.yml:vscode-playwrightskill is added to the collection without mention in the PR description. See inline comment.
Required Actions
- ☐ Check
GitHub Actions workflowunder Type of Change in the PR description. - ☐ Check
Script/automationunder Type of Change in the PR description. - ☐ Either remove the unrelated
maturity: experimentalremovals fromhve-core-all.collection.ymland put them in a separate PR, or explicitly document and justify them in this PR. - ☐ Address the
vscode-playwrightaddition toexperimental.collection.yml— either document it or remove it. - ☐ Fix indentation and add
try/catch+ explicit exit codes in both PowerShell scripts (see inline comment onInvoke-EmbedAudio.ps1line 60). - ☐ Remove the non-standard
*🤖 Crafted with precision...*footer line fromSKILL.md(see inline comment on line 195).
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | noneThere was a problem hiding this comment.
Review Summary
The tts-voiceover skill is well-structured overall: the Python implementation correctly escapes SSML injection vectors, the cross-platform wrapper pair (bash + PowerShell) is provided as required, the fuzz harness is present, and collection manifests are updated. However, there are PR template compliance gaps and several code quality issues that must be addressed before this is ready to merge.
Issue Alignment
Issue #1417 could not be read directly, but the PR description is detailed and coherent with the declared intent of adding an Azure Speech SDK TTS skill. No misalignment is apparent from the description alone.
PR Template Compliance
Required fixes:
-
❌
npm run lint:md-linksis unchecked — marked "(deferred to CI)" but the PR template lists this as a required automated check that must pass before merging. Either run it locally and check the box, or link to a passing CI run that confirms zero broken links. -
❌ "Script/automation" Type of Change checkbox not checked — the PR adds
.py,.sh, and.ps1files. The template's "Other" section includes* [ ] Script/automation (.ps1, .sh, .py), which should be checked alongside the Copilot skill checkbox. -
⚠️ "Reviewed contribution withprompt-builderagent" removed from Type of Change — this item was present in the template's AI Artifacts section but is absent from the PR body's Type of Change section. It appears in the Checklist as unchecked with an N/A note; either restore it to the Type of Change section or confirm the N/A rationale applies there too.
Coding Standards
-
❌ SKILL.md attribution footer is not the last line —
prompt-builder.instructions.mdrequires> Brought to you by microsoft/hve-coreto be "the last line of body content." The*🤖 Crafted...*line appears after it. See inline comment on line 195. -
❌ PowerShell guard block missing indentation — in both
Invoke-GenerateVoiceover.ps1andInvoke-EmbedAudio.ps1, the body of theif ($MyInvocation.InvocationName -ne '.')guard is not indented, inconsistent with repository PowerShell conventions. See inline comment onInvoke-GenerateVoiceover.ps1line 83. -
⚠️ [OutputType([void])]is not a valid PowerShell type —voidis not a recognized .NET type identifier; use[System.Void]or omit the attribute entirely for functions that produce no pipeline output. See inline comment onTtsVoiceoverHelpers.psm1line 50.
Code Quality
-
⚠️ Acronym key matching silently fails after XML escaping —safe_notesis XML-escaped before being passed toapply_acronym_aliases. Any custom lexicon key containing&,<, or>will never match because the text now holds&,<,>. The built-in defaults are unaffected, but the contract is undocumented and fragile for user-supplied lexicons. See inline comment ongenerate_voiceover.pyline 73. -
💡
speechsdkimported twice —azure.cognitiveservices.speechis imported inside bothgenerate_audio()(line 105) and_run()(inside theif not args.dry_run:block). Python caches module imports so this is harmless, but the redundant import is unnecessary. Consider hoisting the import to the top of_run()and passing the module through, or importing lazily in one place only.
Action Items
- Run
npm run lint:md-linksand mark the checkbox (or link a passing CI run). - Check the "Script/automation" box under Type of Change.
- Remove the extra
*🤖 Crafted...*line from SKILL.md (or move it above the standard footer). - Indent the body of the
InvocationNameguard block in both PowerShell wrapper scripts. - Fix
[OutputType([void])]→[System.Void]or omit it inTtsVoiceoverHelpers.psm1. - Document the XML pre-escaping constraint in
apply_acronym_aliases's docstring so future contributors understand the lexicon key restriction.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | noneThere was a problem hiding this comment.
Review Summary
This PR adds a well-structured tts-voiceover skill with cross-platform scripts, a fuzz harness, and proper collection registration. The overall approach is solid — SSML injection is addressed, token refresh logic is present, and the dry-run mode is a good usability touch. Two code correctness issues and two PR template gaps need to be resolved before merge.
Issue Alignment
The linked issue (#1417) could not be read (integrity filter), so full alignment cannot be confirmed. Based on the PR description, the implementation scope appears coherent with the stated feature: YAML speaker notes → SSML → WAV → PPTX embedding.
PR Template Compliance
Two Type of Change checkboxes are missing despite the corresponding files being present in the diff:
- "Script/automation (
.ps1,.sh,.py)" — The PR addsgenerate_voiceover.py,embed_audio.py,fuzz_harness.py, two bash wrappers, and two PowerShell wrappers. This checkbox must be checked. - "GitHub Actions workflow" —
.github/workflows/dependency-review.ymlis modified to allowlist new PyPI packages. This checkbox must be checked.
Please update the PR description to reflect these change types.
Code Quality Findings
🐛 Critical — AttributeError on empty content.yaml (inline comment on line 267–268)
yaml.safe_load() returns None for an empty file; calling .get() on None raises AttributeError and aborts the entire run. A yaml.YAMLError from malformed YAML is also unhandled. See the inline comment for a suggested per-slide try/except + isinstance guard.
⚠️ Convention — Deferred import inside generate_audio() (inline comment on line 112)
Per python-script.instructions.md, imports belong at module level. _make_entra_config correctly receives speechsdk as a parameter; generate_audio should follow the same pattern. See the inline comment for the suggested fix.
Coding Standards
- ✅ Copyright headers present on all Python, bash, and PowerShell files.
- ✅
set -euo pipefailon both bash wrappers. - ✅
$ErrorActionPreference = 'Stop'on both PowerShell scripts. - ✅
yaml.safe_load(notyaml.load) used throughout. - ✅
xml.sax.saxutils.escapeapplied before acronym substitution — SSML injection is properly mitigated. - ✅
pyproject.tomlincludes all required sections ([tool.ruff],[tool.pytest.ini_options],fuzzdependency group,fuzz_harness.pyinpython_files). ⚠️ lxmlpinned as a direct dependency but not imported directly — see inline comment onpyproject.toml:8for a clarification ask (non-blocking if intentional).
Action Items
- ☑️ Check "Script/automation" and "GitHub Actions workflow" in the Type of Change section.
- 🐛 Fix the
AttributeError/YAMLErrorin thecontent.yamlparsing loop (line 267–268). ⚠️ Move theazure.cognitiveservices.speechimport out ofgenerate_audio()by passingspeechsdkas a parameter (line 112).- 💡 Add a comment to
pyproject.tomlexplaining thelxmlexplicit pin, or remove it if it is purely transitive.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | noneThere was a problem hiding this comment.
PR Review — feat(skills): add tts-voiceover skill
The implementation is well-structured: XML injection is correctly mitigated, Entra ID token refresh is implemented, the fuzz harness covers the right pure functions, and the cross-platform wrapper pattern (bash + PowerShell) is followed consistently. Two issues require changes before merge.
❌ Issue Alignment
Linked issue #1417 could not be read due to access restrictions. The PR description provides enough context to evaluate scope and intent.
collections/experimental.collection.yml adds a vscode-playwright entry that is never mentioned in the PR description, Additional Notes, or commit history. This is an undisclosed change. Please remove it or explicitly document and justify it.
❌ PR Template Compliance
The Type of Change section is incomplete. The following checkboxes apply to this PR's actual changes and must be checked:
| Checkbox | Reason |
|---|---|
GitHub Actions workflow |
.github/workflows/dependency-review.yml is modified (adding 4 lines for new PyPI license allowances) |
Script/automation (.ps1, .sh, .py) |
5 script files are added: generate_voiceover.py, embed_audio.py, generate-voiceover.sh, embed-audio.sh, Invoke-GenerateVoiceover.ps1, Invoke-EmbedAudio.ps1 |
Checking only Copilot skill while omitting these applicable categories makes it harder for reviewers to quickly assess the change surface area.
✅ Coding Standards
Files are broadly compliant with the applicable instruction files:
- Python (
python-script.instructions.md,python-test.instructions.md): Copyright headers ✅, SPDX identifiers ✅,pathlib.Paththroughout ✅,create_parser()extracted ✅, proper exit codes ✅,logging(notprint) for diagnostics ✅, fuzz harness with Atheris polyglot pattern ✅,ruffin dev deps ✅. - PowerShell (
powershell.instructions.md):#Requires -Version 7.0✅,[CmdletBinding()]✅,$ErrorActionPreference = 'Stop'✅, invocation guard ✅, comment-based help ✅,Export-ModuleMember✅. - Bash (
bash.instructions.md):set -euo pipefail✅,#!/usr/bin/env bash✅, copyright headers ✅,main()pattern ✅,[[ ]]tests ✅,err()helper ✅. - SKILL.md (
prompt-builder.instructions.md): Required sections present in correct order ✅, attribution suffix indescription:✅, attribution footer ✅.
✅ Code Quality and Security
- XML injection: Notes are correctly escaped with
xml.sax.saxutils.escape()before acronym substitution. Voice/rate attributes are protected withquoteattr(). ✅ - Credentials: Read exclusively from environment variables; no secrets in code. ✅
- Entra ID token refresh:
token_expires_at - 300guard is correct; the initial token is fetched before the loop so thetoken_expires_at = 0default is never used in the auth comparison. ✅ - Path traversal:
--lexiconreads a user-specified local file, which is expected CLI behavior. ✅ - Error handling: Both Python scripts handle
KeyboardInterrupt,BrokenPipeError, and use proper exit codes. ✅ - Dependency review allowances (
certifiMPL-2.0,charset-normalizerLGPL): Justification comments are accurate; these are transitive deps ofazure-identity. ✅
💡 Non-Blocking Suggestions
See inline comments for:
- Hardcoded
xml:lang="en-US"inwrap_ssml— limits non-English voice support (informational for thisexperimental-maturity release). - Missing
--regionCLI flag — asymmetric with other configurable parameters. - Missing
descriptioninpyproject.toml— PEP 621 completeness.
🔧 Required Actions
- Remove or explain the
vscode-playwrightentry inexperimental.collection.yml. - Check
GitHub Actions workflowandScript/automationin the Type of Change section.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | none
WilliamBerryiii
left a comment
There was a problem hiding this comment.
Approving with some requested changes/additions.
katriendg
left a comment
There was a problem hiding this comment.
Thanks, this is a really interesting skill in combination with the powerpoint one especially.
Leaving a few notes I recommend addressing, but approving overall, given you will close all Copilot other reviewer notes before merging.
Looking forward to trying it out!
There was a problem hiding this comment.
PR Review — feat(skills): add tts-voiceover skill
The contribution is well-structured overall: the SKILL.md is complete, the fuzz harness is a nice addition, the XML-injection fix for speaker notes (escape-before-alias-substitution) is correct, and the Entra ID token-refresh logic is thoughtfully implemented. However, one critical runtime bug blocks merge, a security hardening gap should be addressed, and the PR template has several unchecked boxes that are required.
🔴 Issue Alignment
PR links Fixes #1417. The implementation directly addresses the stated goal (per-slide TTS voice-over from YAML speaker notes). No scope creep was detected.
🔴 PR Template Compliance — Missing Type-of-Change Checkboxes
The PR only checked "Copilot skill" under AI Artifacts, but the diff contains multiple additional change types that the template requires to be declared:
| Changed content | Required checkbox (not checked) |
|---|---|
docs/getting-started/tts-voiceover.md (new file, 145 lines) |
Documentation update |
.github/workflows/dependency-review.yml (4 additions) |
GitHub Actions workflow |
scripts/*.py, scripts/*.sh, scripts/*.ps1 (new files) |
Script/automation (.ps1, .sh, .py) |
| New skill adding functionality not present before | New feature |
The "Reviewed contribution with prompt-builder agent" checkbox under AI Artifacts in the Type of Change section is also unchecked with no inline N/A note in that section (the N/A rationale appears only in the Checklist section). While the N/A reasoning may be acceptable, the omission of the other four checkboxes is not.
🔴 Coding Standards — Critical Bug
See inline comment on generate_voiceover.py line 296.
generate_audio(ssml, wav_path, speech_config, speechsdk) passes 4 arguments to a function that only accepts 3. Every non-dry-run invocation raises:
TypeError: generate_audio() takes 3 positional arguments but 4 were given
Fix: remove speechsdk from the call — the function imports it internally.
This was not caught by the existing test suite because the fuzz harness only targets apply_acronym_aliases, wrap_ssml, and load_acronyms, and there are no integration tests that exercise main() without dry-run mode.
⚠️ Security — XML Injection via --voice / --rate
See inline comment on generate_voiceover.py lines 86–87.
voice and rate are interpolated into SSML attribute values without XML escaping. Apply xml.sax.saxutils.escape() to both values before interpolation.
💡 Minor Findings
pyproject.toml line 8 — misleading comment (see inline comment): lxml is a direct dependency here; the comment saying "transitive dep of python-pptx" is inaccurate.
_make_entra_config return type (generate_voiceover.py line 120): the return annotation is tuple (unparameterized). Per python-script.instructions.md, prefer explicit generic types. Suggest tuple[object, float] or more precisely tuple[speechsdk.SpeechConfig, float] (the latter requires a forward-reference or a local import at module level).
docs/getting-started/tts-voiceover.md — cross-tree link: The Related Resources link ../../.github/skills/experimental/tts-voiceover/SKILL.md points outside the docs/ tree. This will be a dead link in the Docusaurus build. Consider linking to the GitHub URL or using an absolute path that Docusaurus can resolve, or documenting in Troubleshooting instead.
Action Items
- [Required] Remove the extra
speechsdkargument from thegenerate_audio(...)call (line 296). - [Required] Escape
voiceandrateinwrap_ssmlwithxml.sax.saxutils.escape(). - [Required] Check the missing Type-of-Change boxes: New feature, Documentation update, GitHub Actions workflow, Script/automation.
- [Suggested] Fix the misleading
lxmlcomment inpyproject.toml. - [Suggested] Parameterize the
tuplereturn type on_make_entra_config. - [Suggested] Fix or remove the cross-tree relative link in
docs/getting-started/tts-voiceover.md.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | noneThere was a problem hiding this comment.
Advisory review, this PR is from a maintainer. Findings are informational only.
Review Overview
PR #1415 adds the tts-voiceover skill for Azure Speech SDK voice-over generation. The implementation is well-structured overall — cross-platform scripts, Pester tests, a fuzz harness, and collection registration are all present. There is one blocking structural defect (YAML corruption) and three code-quality findings in the Python script that are worth addressing before this ships.
1. Issue Alignment
Linked issue: #1417.
The issue could not be read directly due to access constraints, so full alignment cannot be confirmed. The PR description ("Adds a new skill for generating AI voice-overs from PowerPoint slides") is internally consistent and the implementation covers the described scope. Unable to verify that all acceptance criteria from the issue are met.
2. PR Template Compliance
.py ×2, .sh ×2, .ps1 ×2) but only checks the Copilot skill checkbox under "Type of Change". The template also has * [ ] Script/automation (.ps1, .sh, .py) which should be checked here.
Everything else in the template appears filled in correctly: description, related issue, testing notes, and checklist items are present.
3. Coding Standards
experimental.collection.yml — See inline comment at line 20–22. The YAML is corrupted at the powerpoint entry boundary; the tts-voiceover block was accidentally inlined onto the same line as powerpoint's kind field. This will fail YAML parsing and collection validation. Must be fixed before merge.
hve-core-all.collection.yml — No issues. The tts-voiceover entry is correct and maturity: experimental is properly scoped.
dependency-review.yml — The two new PyPI license allowances (certifi, charset-normalizer) look correct and follow the existing pattern.
PowerShell scripts and modules follow the copyright/#Requires/CmdletBinding/param structure from the instructions. Pester tests follow the expected header and Describe -Tag 'Unit' pattern.
Bash scripts include the copyright header and set -euo pipefail. ✅
4. Code Quality
Three findings in generate_voiceover.py are flagged via inline comments:
| # | Location | Severity | Summary |
|---|---|---|---|
| 1 | wrap_ssml lines 87–88 |
voice and rate injected into XML attribute values without xml.sax.saxutils.escape / quoteattr |
|
| 2 | apply_acronym_aliases lines 73–79 |
💡 Low | Successive replacement passes run on text already containing <sub> markup; custom lexicon entries that are substrings of prior aliases could corrupt SSML |
| 3 | _resolve_lexicon lines 136–141 |
💡 Low | Bare Path("acronyms.yaml") returned as a CWD-relative fallback; can silently load an unrelated file; returning None and branching in the caller is safer |
The PR description says "Fixed SSML XML injection" — that fix covers speaker_notes content escaping (which looks correct), but the voice and rate attribute values in wrap_ssml were not addressed in the same pass.
5. Action Items
- Fix
experimental.collection.yml— YAML corruption at lines 20–22 will break collection validation. (Blocking) - Escape
voiceandrateinwrap_ssml— Usexml.sax.saxutils.quoteattr()for attribute value positions. - Tick the
Script/automationcheckbox in the PR description. - (Optional) Address the
_resolve_lexiconCWD fallback and theapply_acronym_aliasesre-scan risk if a follow-up polish pass is planned.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | noneThere was a problem hiding this comment.
This PR introduces a well-structured tts-voiceover skill with solid Python foundations, a polyglot fuzz harness, and cross-platform wrappers. Two blocking issues must be resolved before merge, along with several convention and quality improvements detailed in the inline comments below.
Issue Alignment
Links to Fixes #1417. The PR description and diff are consistent with a new TTS skill capability; the description, sample prompts, and output artifacts section are clear and actionable.
PR Template Compliance
The following items in the template are incorrect or incomplete:
- "Documentation update" is not checked —
docs/getting-started/tts-voiceover.mdwas added todocs/. - "GitHub Actions workflow" is not checked —
.github/workflows/dependency-review.ymlwas modified (new license allowlist entries). - "Script/automation (.ps1, .sh, .py)" is not checked — the PR adds
generate_voiceover.py,embed_audio.py,Invoke-GenerateVoiceover.ps1,Invoke-EmbedAudio.ps1,generate-voiceover.sh,embed-audio.sh, andTtsVoiceoverHelpers.psm1.
The following Required Automated Checks are incorrectly marked N/A:
lint:psis marked "N/A: no PowerShell files changed" — the PR adds four PowerShell files (Invoke-GenerateVoiceover.ps1,Invoke-EmbedAudio.ps1,TtsVoiceoverHelpers.psm1,TtsVoiceoverHelpers.Tests.ps1). Please runnpm run lint:psand update the checklist.docs:testis marked "N/A: no docs/ changes" —docs/getting-started/tts-voiceover.mdwas added. Please runnpm run docs:testand update the checklist.
Coding Standards
See inline comments for:
collections/experimental.collection.ymllines 20–21 — Blocker: malformed YAML (newline lost between entries causes a parse error; entry is also duplicated). Confirmed viayaml.safe_load().SKILL.mdline 4 —compatibility:is not part of the defined skill frontmatter schema; move to the Prerequisites section body.docs/getting-started/tts-voiceover.mdline 2 —sidebar_positionis required by Docusaurus conventions but is missing from the frontmatter.
Code Quality and Security
See inline comments for:
generate_voiceover.pylines 88–89 — Security:voiceandrateCLI args are interpolated into SSML XML attributes withoutxml.sax.saxutils.escape(). The PR states "Fixed SSML XML injection" but that fix only covers speaker notes. The same escape already imported and used for notes must be applied tovoiceandrate.generate_voiceover.pyline 79 —apply_acronym_aliasesusesstr.replace()without word boundaries, which can spuriously match acronyms mid-word (e.g.,"AST"inside"faster"). Switching tore.sub(r'\b' + re.escape(acronym) + r'\b', ...)addresses this.
Required Actions
- Fix
collections/experimental.collection.yml— the YAML is unparseable due to a missing newline between thepowerpointandtts-voiceoverentries (lines 20–21). The entry is also duplicated. This is a blocker that prevents the collection from loading. - Fix
wrap_ssmlingenerate_voiceover.py— applyxml.sax.saxutils.escape()tovoiceandratebefore XML interpolation to complete the XML injection fix described in the PR notes. - Correct PR template checkboxes — check "Documentation update", "GitHub Actions workflow", and "Script/automation". Mark
lint:psanddocs:testas run (not N/A) after executing them. - Remove the non-standard
compatibility:field fromSKILL.mdfrontmatter. - Add
sidebar_positiontodocs/getting-started/tts-voiceover.mdfrontmatter. - Address the word-boundary concern in
apply_acronym_aliases(medium priority, can be in a follow-up if preferred).
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | noneThere was a problem hiding this comment.
Code Review — feat(skills): add tts-voiceover skill
This is a well-structured addition. The implementation demonstrates strong security awareness (XML-escaped speaker notes before SSML processing, quoteattr for SSML attributes, proper Entra ID token refresh), good test coverage (28 Python tests + fuzz harness + Pester tests), and clean cross-platform wrappers. The collection, plugin, and documentation artifacts are all correctly updated.
✅ Issue Alignment
PR links Fixes #1417. The changes (new tts-voiceover skill with Azure Speech SDK TTS generation and PPTX audio embedding) are consistent with the stated description.
✅ PR Template Compliance
All required sections are populated, checkboxes match the actual change types, and the testing section is detailed with specific test counts and validation commands. The AI-artifact checklist N/A reasoning is valid (this skill ships Python scripts, not prompt/agent artifacts).
✅ Coding Standards
Conventions followed across Python, PowerShell, and Bash:
- Python files follow the prescribed organization (shebang, copyright, SPDX,
from __future__ import annotations, constants, module logger,create_parser,_run,main) - SSML injection protection is correctly implemented (XML-escape → acronym substitution ordering)
- Bash scripts include
set -euo pipefail,err(),usage(), and the main-function pattern - PowerShell modules and scripts include all required elements (copyright,
#Requires, comment-based help,[CmdletBinding()],$ErrorActionPreference = 'Stop', invocation guard) - Pester tests use
-Tag 'Unit',BeforeAll/AfterAll, and the correct#Requires -Modules Pesterheader ordering
Three minor suggestions are attached as inline comments (see below).
✅ Code Quality
- Dependency pin (
lxml>=6.1.0) with a CVE-patch rationale comment is a good practice dependency-review.ymllicense allowances forcertifi(MPL-2.0) andcharset-normalizer(MIT) are accurateembed_slide_audiocorrectly handles theNoneshape-id edge case (orphan removal + error log)- No secrets are logged; auth tokens are constructed in memory only
uv.lockcommitted for reproducible builds ✅
💡 Suggestions (non-blocking)
Three inline comments are attached:
generate_voiceover.pyline 328 — Replaceassert speech_resource_id is not Nonewith an explicit guard;assertis silently stripped by Python's-Oflag.embed_audio.pylines 247-250 — Theverboseparameter inconfigure_loggingis never passed aTruevalue (no--verboseCLI flag). Add the flag or remove the dead parameter; also consider aligning withgenerate_voiceover.pywhich omits the wrapper entirely.Invoke-EmbedAudio.ps1line 47 (andInvoke-GenerateVoiceover.ps1)— Optional parameters should use[Parameter(Mandatory = $false)]per the PowerShell instructions' requirement to declareMandatoryattributes explicitly.
No blocking issues found. The skill is ready for human review pending resolution of the suggestions above at the author's discretion.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | none- add --verbose/-v flag to embed_audio.py and wire to configure_logging - replace assert with RuntimeError for speech_resource_id type narrowing - add explicit Mandatory = $false on all optional PS1 parameters 🐛 - Generated by Copilot
There was a problem hiding this comment.
Automated PR Review
This PR adds the tts-voiceover experimental skill, which generates WAV audio from PowerPoint slide speaker notes via the Azure Speech SDK and embeds the audio into the deck using python-pptx. The implementation is well-structured, cross-platform (Python + PowerShell + bash), and includes comprehensive tests (28 total, with a fuzz harness for OSSF Scorecard compliance).
No blocking issues were found. Three minor coding-standard observations are captured as inline comments.
Issue Alignment
Fixes #1417, so the link is structurally correct. Human reviewers should confirm the PR fully addresses the issue requirements before merging.
PR Template Compliance
✅ All major template sections are filled in with substantive content.
💡 Minor deviation: The Type of Change → AI Artifact Contributions section does not check "Reviewed contribution with prompt-builder agent and addressed all feedback." The author explains in the Checklist section that this is N/A because the skill contains Python/PowerShell/bash scripts rather than .prompt.md, .agent.md, or .instructions.md artifacts. The reasoning is sound, but the checklist item should be annotated (N/A — skill contains executable scripts, not prompt artifacts) per the PR conventions rather than left unchecked without inline annotation.
Coding Standards
Three inline comments were posted. Summary:
-
generate_voiceover.py→main()(~line 374) — Thelogging.basicConfig()call should be extracted into aconfigure_logging(verbose: bool = False)function perpython-script.instructions.md(Script Organization, point 11). The siblingembed_audio.pyalready follows this convention. Adding a--verbose/-vflag would also bring the scripts into parity and is especially useful for diagnosing Azure SDK synthesis failures. -
embed_audio.py→_add_narration_timing()(~line 60) — Thexmlns:anamespace is declared in the timing XML string but thea:prefix is never used. The dead declaration should be removed. A forward-looking suggestion to use thelxml.etreebuilder (already imported) was also included. -
tests/test_embed_audio.py→ line 7 —from unittest.mock import MagicMockis imported and used formock_slide/mock_shapeobjects that configurereturn_value(spy behaviour). Perpython-test.instructions.md,mocker.MagicMock()should be used for spy/assert targets; the direct import is reserved for pure data stubs.
Code Quality and Security
✅ XML injection — Speaker notes are correctly passed through xml.sax.saxutils.escape() before SSML wrapping, and voice/rate attribute values use xml.sax.saxutils.quoteattr(). No injection surface found.
✅ Credential handling — SPEECH_KEY is never logged or echoed. Entra ID token refresh includes a 300-second buffer to prevent mid-run auth failures.
✅ Workflow change is safe — The dependency-review.yml modification only adds pkg:pypi/certifi and pkg:pypi/charset-normalizer to the license allow-list with explanatory comments. The pre-existing actions/checkout@de0fac2e... SHA was already on main and is not introduced by this PR.
✅ Error handling — Both scripts use EXIT_SUCCESS / EXIT_FAILURE / EXIT_ERROR constants, propagate KeyboardInterrupt as exit code 130, and handle BrokenPipeError. The PowerShell wrappers use $ErrorActionPreference = 'Stop' with try/catch and explicit exit codes.
✅ Collection manifest — tts-voiceover is correctly registered in collections/experimental.collection.yml with maturity: experimental, appropriate for a new skill.
Action Items
| # | File | Item | Priority |
|---|---|---|---|
| 1 | generate_voiceover.py |
Extract configure_logging() and add --verbose flag |
Suggestion |
| 2 | embed_audio.py |
Remove unused xmlns:a namespace declaration |
Suggestion |
| 3 | tests/test_embed_audio.py |
Replace direct MagicMock import with mocker.MagicMock() for spy targets |
Suggestion |
| 4 | PR description | Annotate the unchecked prompt-builder checkbox with (N/A — ...) |
Minor |
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | none…rate_voiceover 🔧 - Generated by Copilot
There was a problem hiding this comment.
PR Review: feat(skills): add tts-voiceover skill
Overall this is a high-quality, well-structured addition. The implementation is secure, follows codebase conventions throughout, ships with comprehensive test coverage (28 tests including fuzz harness), and handles both key-based and Entra ID authentication cleanly.
Issue Alignment
Linked issue #1417 could not be read due to repository integrity policy, so issue alignment cannot be fully verified. However, the PR description is detailed and internally consistent — the stated purpose (Azure Speech SDK TTS skill with SSML acronym control, dry-run mode, and PPTX embedding) is clearly and completely implemented.
PR Template Compliance
✅ Description, Related Issue, Testing, Security Considerations, and Required Automated Checks sections are all filled in correctly.
* [ ] Reviewed contribution with prompt-builder agent and addressed all feedback under Type of Change → AI Artifacts, but this checkbox is absent from that section in the PR body. The author correctly addressed it with an N/A explanation in the Checklist → AI Artifact Contributions section instead. While the intent is clear, the checkbox should also appear (unchecked with inline N/A note) in its original template location so the Type of Change section matches the template structure.
Coding Standards
✅ generate_voiceover.py and embed_audio.py follow python-script.instructions.md conventions: PEP 723 entry points, argparse via create_parser(), logging throughout, pathlib.Path, EXIT_SUCCESS/FAILURE/ERROR constants, main() guard.
✅ TtsVoiceoverHelpers.psm1, Invoke-GenerateVoiceover.ps1, and Invoke-EmbedAudio.ps1 follow powershell.instructions.md: copyright headers, #Requires -Version 7.0, CmdletBinding, [OutputType], #region/#endregion, Export-ModuleMember.
✅ generate-voiceover.sh and embed-audio.sh follow bash.instructions.md: set -euo pipefail, copyright headers, err() helper, usage(), argument parsing via case.
✅ SKILL.md follows skill file conventions: required sections in order, parallel bash/PowerShell scripts, relative file references, attribution footer.
✅ pyproject.toml satisfies Python skill config requirements: [tool.ruff], [tool.ruff.lint], [tool.pytest.ini_options] with fuzz_harness.py discovery, fuzz dependency group with atheris>=3.0.
✅ Pester tests follow pester.instructions.md: #Requires -Modules Pester first, copyright after, -Tag 'Unit' on Describe blocks, BeforeAll/AfterAll, $TestDrive, module-scoped mocks.
Code Quality
✅ XML injection prevention is solid: xml.sax.saxutils.escape() on speaker notes before acronym substitution; xml.sax.saxutils.quoteattr() on voice/rate in wrap_ssml; integer shape_id/duration_ms safely interpolated into PPTX timing XML.
✅ @functools.lru_cache on _compile_acronym_pattern correctly uses a tuple[str, ...] key (hashable). Pattern is sorted longest-first to prevent prefix-shadowing in single-pass replacement.
✅ Entra ID token refresh (token_expires_at - 300) logic is correct. use_entra_auth accurately reflects the key-takes-precedence branching in _run.
✅ yaml.safe_load used throughout — no arbitrary code execution from YAML input.
embed_audio.py line 246: _run returns EXIT_FAILURE when embedded_count == 0 without emitting a summary error message, leaving users with only individual SKIP slide N INFO lines and no clear explanation of the failure.
Security
✅ SPEECH_KEY and SPEECH_RESOURCE_ID read exclusively from environment variables — no hardcoded credentials.
✅ load_acronyms uses yaml.safe_load and validates the parsed result type before accepting it.
✅ Dependency review workflow updated with correct license allowances and explanatory comments for new transitive dependencies (certifi MPL-2.0, charset-normalizer MIT).
Action Items
embed_audio.py:246— Add alogger.error(...)summary beforereturn EXIT_FAILUREwhenembedded_count == 0(see inline comment).- (Optional) Restore the
Reviewed contribution with prompt-builder agentcheckbox to the Type of Change → AI Artifacts section of the PR description, even if only to mark it unchecked with an inline N/A note, to keep the template structure intact.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | none🐛 - Generated by Copilot
There was a problem hiding this comment.
PR Review: feat(skills): add tts-voiceover skill
This is a well-structured, thorough addition. The Python implementation is clean and security-conscious (XML-escaping before SSML construction, Entra ID auth support, input-equals-output guard in embed_audio.py), the test suite follows repo conventions correctly, and the pyproject.toml is complete with all required sections ([tool.ruff], [tool.pytest.ini_options], fuzz dependency group). A few issues need addressing before merge.
📋 Issue Alignment
PR links Fixes #1417. The changes are self-consistent with the stated goal of adding a TTS voice-over skill for Azure Speech SDK. No scope creep detected; the workflow change (dependency-review.yml) is appropriately scoped to allowlisting transitive Azure SDK dependencies.
📄 PR Template Compliance
✅ All required template sections are filled in, checkboxes are checked and accurate, testing is described, and security considerations are documented.
📐 Coding Standards
markdown.instructions.md (applyTo: **/*.md) — applies to SKILL.md, docs/getting-started/tts-voiceover.md, and all other .md files in this PR:
"Try to always use
*for unordered lists. Avoid using-and+for unordered lists unless the file already uses these."
Both new .md files use - exclusively for bullet lists. See inline comments on SKILL.md line 19 and docs/getting-started/tts-voiceover.md line 19.
prompt-builder.instructions.md — skill attribution standard:
"Skill files also include a standard attribution footer as the last line of body content:
> Brought to you by organization/repository-name"
The correct footer is present, but a non-standard *🤖 Crafted with precision...* line precedes it in both SKILL.md (line 181) and the docs page (lines 153-154). In the docs page this line is only present because a <!-- markdownlint-disable MD036 --> suppression was added to silence the resulting linter warning — the fix is to remove the content, not the lint check. See inline comments.
Missing --verbose / -v documentation — the Parameters Reference tables in SKILL.md omit the verbose flag that both scripts support. See inline comment on line 77.
🔍 Code Quality
generate_voiceover.py line 290 — use_entra_auth is computed from speech_resource_id and speech_key, which are only assigned inside if not args.dry_run:. The expression silently evaluates to False during dry-run because both variables are None, which is correct but fragile. Moving the expression inside the conditional (or initializing the variables to None before it) makes the scoping relationship explicit. See inline comment.
🔒 Security
No issues found. Highlights:
SPEECH_KEYis read exclusively from environment variables — never from CLI args or files.- All user-supplied text is XML-escaped with
xml.sax.saxutils.escape()before SSML construction. - The SSML injection risk noted in the PR description is addressed.
- The input-equals-output guard in
embed_audio.py(line 206-210) prevents accidental in-place overwrite.
✅ What's Well Done
- Fuzz harness covers all three public pure functions (
apply_acronym_aliases,wrap_ssml,load_acronyms) in both Atheris and pytest modes. - Test suite correctly uses
mockerfixture (pytest-mock) for spy targets and directMagicMock()for stubs — aligned withpython-test.instructions.md. pyproject.tomlis complete:[tool.ruff],[tool.ruff.lint],[tool.pytest.ini_options]withpython_files = ["test_*.py", "fuzz_harness.py"], andfuzzdependency group withatheris>=3.0.- PowerShell scripts and module follow naming, copyright header,
CmdletBinding, andExport-ModuleMemberconventions. collections/experimental.collection.ymlcorrectly setsmaturity: experimental.
🛠️ Action Items
- Replace
-bullet markers with*throughoutSKILL.mdanddocs/getting-started/tts-voiceover.md. - Remove lines 181 from
SKILL.md(the*🤖 Crafted...*line). - Remove lines 152-154 from
docs/getting-started/tts-voiceover.md(themarkdownlint-disablecomment and Copilot meta-commentary). - Add
--verbose/-vrows to both parameters tables inSKILL.md. - Clarify
use_entra_authscoping ingenerate_voiceover.py(move insideif not args.dry_run:or initialize variables toNonebefore the block).
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | noneThere was a problem hiding this comment.
PR Review — feat(skills): add tts-voiceover skill
Overall this is a well-structured, thoughtfully implemented skill. The SSML injection fix, Entra ID token refresh logic, comprehensive test suite (28 tests including a fuzz harness), and thorough documentation all reflect high quality work. The inline comments below flag minor improvements only.
Issue Alignment
Issue #1417 is linked via Fixes #1417. The PR description clearly describes the delivered functionality (per-slide WAV generation, PPTX audio embedding, SSML acronym aliases, dry-run mode, dual auth). ✅
PR Template Compliance
| Item | Status |
|---|---|
| Description filled | ✅ |
| Related Issue linked | ✅ Fixes #1417 |
| Type of Change checked | ✅ New feature, Documentation, GitHub Actions workflow, Copilot skill, Script/automation |
| Workflow checkbox matches diff | ✅ dependency-review.yml modified |
| Skill checkbox matches diff | ✅ SKILL.md added |
| Testing section filled | ✅ 28 tests, all validation commands listed |
| Required Automated Checks | ✅ All checked |
| Security Considerations | ✅ All checked |
prompt-builder agent and addressed all feedback" checkbox. This item is entirely absent from the PR description (not unchecked with justification — removed). The equivalent items in the Checklist section are marked N/A with a reasonable justification ("skill contains Python scripts, not prompt/agent artifacts"), but the Type of Change section should mirror that coverage. Not blocking given the explicit Checklist N/A callout.
Coding Standards
Python (generate_voiceover.py, embed_audio.py):
- ✅ Follows
python-script.instructions.md: shebang, copyright headers, strict mode, entry point, exit codes, Google-style docstrings,Paththroughout,argparsewithcreate_parser(),configure_logging(),main()guard. - ✅
xml.sax.saxutils.escape()applied before acronym substitution correctly addresses the SSML injection vector. - ✅
xml.sax.saxutils.quoteattr()used for voice/rate attribute values inwrap_ssml. ⚠️ See inline comment ongenerate_voiceover.pyline ~338: unreachableNoneguard is a dead-code hazard; preferassertfor type-narrowing invariants.
PowerShell (Invoke-GenerateVoiceover.ps1, Invoke-EmbedAudio.ps1, TtsVoiceoverHelpers.psm1):
- ✅ Follows
powershell.instructions.md: copyright headers,#!/usr/bin/env pwsh,#Requires -Version 7.0, comment-based help,[CmdletBinding()],$ErrorActionPreference = 'Stop', main execution guard,#region/#endregion. - ✅
TtsVoiceoverHelpers.psm1has explicitExport-ModuleMember. ⚠️ See inline comments on both.ps1wrappers:-Verboseis not forwarded to Python's--verboseflag, leaving a usability gap for debugging.
Bash (generate-voiceover.sh, embed-audio.sh):
- ✅ Follows
bash.instructions.md:#!/usr/bin/env bash, copyright header,set -euo pipefail,err()helper,usage(),main()pattern. - ✅ Verbose pass-through is handled correctly via
passthrough_args— users can pass--verbosedirectly and it reaches Python.
Code Quality and Security
- ✅ No secrets hardcoded;
SPEECH_KEYandSPEECH_RESOURCE_IDread exclusively from environment variables with clear error messages. - ✅
dependency-review.ymladditions (certifiMPL-2.0,charset-normalizerMIT) are accurately commented and license-justified. - ✅
lxmlpinned explicitly with CVE justification comment inpyproject.toml. - ✅ Input path ≠ output path check in
embed_audio.pyprevents silent overwrite. - ✅
etree.fromstring()parses the hand-built timing XML on insertion;shape_idandduration_msare integers so no injection risk. - ✅ Entra ID token refresh (5 min before expiry) handles long-running deck processing correctly.
Action Items
generate_voiceover.py~line 338: Replace the unreachableif speech_resource_id is None: raise RuntimeError(...)withassert speech_resource_id is not Noneto communicate the invariant clearly (see inline comment).Invoke-GenerateVoiceover.ps1line ~111 andInvoke-EmbedAudio.ps1line ~82: Addif ($VerbosePreference -ne 'SilentlyContinue') { $PythonArgs += '--verbose' }so PowerShell's-Verbosepropagates to Python debug logging (see inline comments).- Template housekeeping (non-blocking): Retain the
prompt-builderreview checkbox in the Type of Change "AI Artifacts" section (left unchecked or marked N/A inline) to keep the template intact for future reviewers.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | none* SKILL.md: convert unordered list markers from - to * * SKILL.md: document --verbose/-v flag in both parameter tables * SKILL.md: remove non-standard Copilot meta-commentary line * docs/getting-started/tts-voiceover.md: remove markdownlint-disable and Copilot meta lines * generate_voiceover.py: scope use_entra_auth inside auth setup; replace dead None guard with assert * Invoke-EmbedAudio.ps1, Invoke-GenerateVoiceover.ps1: forward -Verbose to Python --verbose 🔧 - Generated by Copilot
There was a problem hiding this comment.
PR Review: feat(skills): add tts-voiceover skill for Azure Speech SDK voice-over generation
Overall this is a high-quality, well-structured contribution. The Python scripts follow repo conventions (entry points, exit codes, create_parser, configure_logging, type hints, pathlib.Path, yaml.safe_load, proper XML escaping). The PowerShell wrappers and Pester tests comply with repo standards. The fuzz harness, collection manifests, and documentation are all present and correct.
Issue Alignment
- Fixes #1417 is present. ✅
- The PR description, execution flow, and output artifacts are clearly documented. ✅
- No scope creep detected — changes are limited to the new skill, its collection registrations, dependency-review allowlist entries, a cspell entry, and a getting-started doc page. ✅
PR Template Compliance
- Description, Related Issue(s), Type of Change, Testing, and Checklist sections are all filled in. ✅
- The
prompt-builderreview checkbox is marked N/A with a clear justification (the skill ships Python scripts, not prompt/agent artifacts). This is acceptable. ✅ - All required automated check boxes are marked as passed. ✅
- Security Considerations section is complete. ✅
Coding Standards
- Python scripts follow
python-script.instructions.md: correct section ordering,from __future__ import annotations,argparseviacreate_parser(),configure_logging(),pathlib.Paththroughout,yaml.safe_load, Google-style docstrings,EXIT_SUCCESS/EXIT_FAILURE/EXIT_ERRORconstants. ✅ - PowerShell wrappers follow
powershell-script.instructions.md: copyright headers, shebang,#Requires -Version 7.0, comment-based help,[CmdletBinding()],$ErrorActionPreference = 'Stop',Export-ModuleMember. ✅ - Pester tests follow
pester-test.instructions.md:#Requires -Modules Pesterbefore copyright header,-Tag 'Unit'onDescribeblocks. ✅ - Bash scripts follow
bash-script.instructions.md: copyright headers,set -euo pipefail,main()pattern,lower_snake_casefunctions, error function. ✅ pyproject.tomlincludes[tool.ruff],[tool.ruff.lint],[tool.pytest.ini_options],python_files = ["test_*.py", "fuzz_harness.py"], and afuzzdependency group withatheris>=3.0— all required byvalidate:skills. ✅maturity: experimentalis set correctly in both collection YAMLs. ✅
Code Quality
Three minor suggestions are left as inline comments:
-
generate_voiceover.pyline ~333 — When bothSPEECH_KEYandSPEECH_RESOURCE_IDare set, key-based auth silently wins. A warning log would make the active credential path transparent. (Non-blocking) -
embed_audio.pyline ~90 —_add_narration_timingunconditionally removes any existingp:timingelement, which silently overwrites complex authored animations. A warning when non-trivial timing is detected would help users understand the trade-off. (Non-blocking) -
fuzz_harness.pyline ~43 — Fuzz target functions referenceatherisunconditionally but the import is conditional onFUZZING = True. The current code is safe (the fuzz dispatch path is guarded), but the latentNameErroris worth addressing for future maintainability. (Non-blocking)
Security
- XML injection is properly mitigated:
xml.sax.saxutils.escape()applied to speaker notes before acronym substitution;xml.sax.saxutils.quoteattr()used for voice/rate attributes. ✅ - Acronym alias values are also properly quoted via
quoteattr. ✅ - Credentials read exclusively from environment variables; no hardcoded tokens or keys. ✅
yaml.safe_loadused throughout. ✅- Entra ID token refresh uses
azure-identity'sDefaultAzureCredentialwith a 5-minute pre-expiry buffer. ✅ dependency-review.ymlallowlist additions (certifi,charset-normalizer) are justified as Azure SDK transitive dependencies with accurate SPDX comments. ✅
Action Items
None of the findings are blocking. The three inline comments are suggestions that would improve robustness and debuggability — the author can choose to address them or leave a rationale for the current approach.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | none* generate_voiceover.py: warn when both SPEECH_KEY and SPEECH_RESOURCE_ID set * embed_audio.py: warn when replacing slide timing with authored animations * fuzz_harness.py: guard fuzz target bodies with FUZZING check 🔧 - Generated by Copilot
There was a problem hiding this comment.
PR Review — feat(skills): add tts-voiceover skill
This review covers issue alignment, PR template compliance, coding standards (Python, PowerShell, Bash, Markdown), and code quality/security for all 29 changed files.
Issue Alignment
Linked issue #1417 could not be read due to repository policy restrictions on low-integrity resources. Based on the PR description, the implementation is coherent and self-consistent: it adds a TTS voice-over skill using Azure Speech SDK with SSML pronunciation control, cross-platform scripts, fuzz harness, and 28 tests. No scope creep or obviously missing pieces are visible from the diff alone.
⚠️ A human reviewer should verify that the PR fully satisfies #1417's acceptance criteria.
PR Template Compliance
| Section | Status | Notes |
|---|---|---|
| Description | ✅ | Clear, detailed, complete |
| Related Issue(s) | ✅ | Fixes #1417 present |
| Type of Change | See below | |
| Sample Prompts | ✅ | All four sub-sections filled |
| Testing | ✅ | 28 tests, specific commands listed |
| Checklist | ✅ | All required items addressed |
| Security Considerations | ✅ | All three items checked |
Type of Change — AI Artifacts deviation: The template's AI Artifacts subsection includes a first-line checkbox: [ ] Reviewed contribution with 'prompt-builder' agent and addressed all feedback. This checkbox was removed from the PR description rather than left unchecked with a justification. The equivalent items appear correctly in the Checklist › AI Artifact Contributions section with N/A explanations, which is a reasonable justification for a Python-script skill, but the Type of Change checkbox should remain present (unchecked + explanation) per the template structure.
Coding Standards
Reviewed against python-script.instructions.md, python-test.instructions.md, powershell.instructions.md, pester-testing.instructions.md, bash.instructions.md, prompt-builder.instructions.md, and the Markdown/frontmatter instructions.
| File | Verdict | Notes |
|---|---|---|
generate_voiceover.py |
Inline comment: assert in production path (line 349) |
|
embed_audio.py |
Inline comment: silent single-sequence timing removal (line 63) | |
SKILL.md |
✅ | Attribution footer, progressive disclosure, all required sections present |
Invoke-GenerateVoiceover.ps1 |
✅ | Copyright, #Requires, help, [CmdletBinding()], invocation guard all present |
Invoke-EmbedAudio.ps1 |
✅ | Same — fully compliant |
TtsVoiceoverHelpers.psm1 |
✅ | [OutputType], Export-ModuleMember, purpose comment all correct |
generate-voiceover.sh / embed-audio.sh |
✅ | Copyright, set -euo pipefail, main(), err(), usage() all present |
TtsVoiceoverHelpers.Tests.ps1 |
✅ | #Requires -Modules Pester first, -Tag 'Unit', BeforeAll/AfterAll cleanup, Should -Invoke used correctly |
pyproject.toml |
✅ | [tool.ruff], [tool.pytest.ini_options], fuzz group with atheris, python_files includes fuzz_harness.py — all required by validate:skills |
collections/experimental.collection.yml |
✅ | maturity: experimental is appropriate |
dependency-review.yml |
✅ | certifi (MPL-2.0) and charset-normalizer (MIT) allowances are justified and commented |
Code Quality & Security
✅ SSML XML injection mitigation: xml.sax.saxutils.escape(notes) is applied before apply_acronym_aliases, and alias values are protected with quoteattr. The order is correct and the docstring documents the pre-escape contract clearly.
✅ Entra ID token refresh: The 5-minute pre-expiry refresh is implemented and tested.
✅ YAML parsing: yaml.safe_load() is used throughout — no yaml.load() with a full Loader.
✅ Secret handling: SPEECH_KEY is read from environment variables, never hardcoded. No secrets appear anywhere in the diff.
✅ lxml explicit pin: Direct dependency with comment explaining the CVE-patch rationale is good practice.
assert for type-narrowing (generate_voiceover.py line 349): See inline comment. Python's -O flag disables all assert statements; an explicit if ... raise is more robust in production paths.
embed_audio.py line 63): See inline comment. The len > 1 guard means single-animation slides are silently overwritten. Either lower the threshold to >= 1 or document this in SKILL.md's Troubleshooting table.
Action Items
- [Required by template] Restore the
[ ] Reviewed contribution with 'prompt-builder' agentcheckbox to the Type of Change › AI Artifacts section (can remain unchecked with the existing N/A explanation). - [Suggested] Replace the
assert speech_resource_id is not Noneat line 349 ofgenerate_voiceover.pywith an explicit guard (see inline comment). - [Suggested] Adjust the
> 1sequence guard in_add_narration_timingto warn for any existing timing, or add a Troubleshooting entry inSKILL.mdfor users whose authored animations are removed (see inline comment).
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | none* embed_audio.py: warn on any existing p:timing replacement, include shape_id * generate_voiceover.py: replace assert with explicit RuntimeError (assert stripped under -O) * SKILL.md: document animation overwrite in Troubleshooting table 🔧 - Generated by Copilot
There was a problem hiding this comment.
Automated Quality Review — feat(skills): add tts-voiceover skill
Overview
This PR adds a well-scoped, thoroughly tested tts-voiceover experimental skill. The implementation is clean, the security posture is solid, and the cross-platform surface (Python + Bash + PowerShell) is consistent. Two inline suggestions are attached; neither is blocking.
🔗 Issue Alignment
The PR references Fixes #1417. The linked issue could not be read during this automated review pass due to repository integrity policy constraints. Based on the PR description, the implementation is coherent and self-consistent: the stated goals (YAML speaker-notes → WAV → PPTX embedding with SSML acronym control) are all addressed by the changed files. No scope creep was detected.
Action for maintainer: please verify that issue #1417's acceptance criteria are fully satisfied before merging.
✅ PR Template Compliance
| Section | Status | Notes |
|---|---|---|
| Description | ✅ | Clear, specific, well-structured |
| Related Issue | ✅ | Fixes #1417 present |
| Type of Change | See below | |
| Sample Prompts | ✅ | All four sub-sections filled |
| Testing | ✅ | 28 tests listed with commands |
| Required Checks | ✅ | All checked |
| Security Considerations | ✅ | All three items checked |
[ ] Reviewed contribution with 'prompt-builder' agent and addressed all feedback in the AI Artifacts block of the Type of Change section. The PR body removes this checkbox entirely rather than leaving it unchecked with an N/A annotation. The intent is addressed in the AI Artifact Contributions checklist section (marked N/A with rationale), so this is not blocking, but the template section should preserve the checkbox for audit completeness:
**AI Artifacts:**
* [ ] Reviewed contribution with `prompt-builder` agent and addressed all feedback — N/A: skill contains Python scripts, not prompt/agent artifacts
* [x] Copilot skill (`.github/skills/*/SKILL.md`)📋 Coding Standards
| File / Area | Finding |
|---|---|
Python scripts (generate_voiceover.py, embed_audio.py) |
✅ Follows python-script.instructions.md — entry points, exit codes, argparse, pathlib, type hints, logging, docstrings all present |
PowerShell module (TtsVoiceoverHelpers.psm1) |
✅ Follows powershell.instructions.md — copyright header, #Requires -Version 7.0, CmdletBinding, OutputType, Export-ModuleMember |
PowerShell scripts (Invoke-GenerateVoiceover.ps1, Invoke-EmbedAudio.ps1) |
✅ Shebang, copyright, #Requires, comment-based help, CmdletBinding, ErrorActionPreference, module import, region blocks, main guard all present |
Bash scripts (generate-voiceover.sh, embed-audio.sh) |
✅ Follows bash.instructions.md — shebang, copyright, set -euo pipefail, err/usage functions, main pattern |
pyproject.toml |
✅ All required sections present: [tool.ruff], [tool.ruff.lint], [tool.pytest.ini_options], python_files including fuzz_harness.py, fuzz dependency group with atheris>=3.0 |
| Collections | ✅ Both experimental.collection.yml/.md and hve-core-all.collection.yml/.md updated with correct maturity: experimental |
SKILL.md frontmatter |
✅ name, description with attribution suffix, metadata block all present; attribution footer > Brought to you by microsoft/hve-core present |
docs/getting-started/tts-voiceover.md |
✅ Full Docusaurus frontmatter (title, description, sidebar_position, author, ms.date, ms.topic, keywords, estimated_reading_time); no H1 after frontmatter (correct) |
dependency-review.yml |
✅ License exceptions for certifi and charset-normalizer properly documented and formatted |
🔒 Code Quality and Security
Security — ✅ No vulnerabilities found:
xml.sax.saxutils.escape()applied to speaker notes before acronym substitution prevents SSML injection via slide content.xml.sax.saxutils.quoteattr()used forvoiceandrateCLI arguments inwrap_ssml— protected against injection via command-line inputs.- Acronym
aliasvalues escaped withquoteattr, acronym keys escaped withescapein_replace— correct per SSML<sub>semantics. SPEECH_KEY/SPEECH_RESOURCE_IDsourced from environment variables, never hardcoded.- Entra ID token refresh at
expires_on - 300susing integer timestamp comparison — correct forazure-identityAccessToken.expires_on. etree.fromstring()parses a developer-constructed string (no user data interpolated except integerduration_msandshape_id) — safe.
Two inline suggestions are attached (see diff comments):
- Line 76 —
xmlns:anamespace declaration is unused in the timing XML; minor cleanup. - Line 115 —
advClick="1"allows click-to-advance during audio playback; recommend documenting this behaviour or evaluatingadvClick="0"for audio-only auto-advance.
📊 Review Summary
| Category | Count |
|---|---|
| Security vulnerabilities | 0 |
| Critical / blocking findings | 0 |
| Suggestions (inline) | 2 |
| Template compliance notes | 1 |
The skill is production-quality for an experimental artifact. Suggest addressing the advClick behaviour (even if only in documentation) before promoting this skill out of experimental status.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | none…advance * embed_audio.py: remove unused DrawingML namespace from timing XML * embed_audio.py: set advClick=0 so slides advance only on timer, not click 🔧 - Generated by Copilot
There was a problem hiding this comment.
PR Review — feat(skills): add tts-voiceover skill for Azure Speech SDK voice-over generation
This is a well-crafted skill addition with good security hygiene (XML-escaping before acronym substitution, Entra ID token refresh, quoteattr on voice/rate), comprehensive cross-platform scripts (bash + PowerShell), solid test coverage (28 tests + fuzz harness), and thorough documentation. The structured approach to dependency allowlisting in dependency-review.yml is correct and well-commented.
Two issues require resolution before merge.
🔍 Issue Alignment
Issue #1417 was linked via "Fixes #1417". The PR description is coherent and describes a complete, self-contained skill. Issue alignment appears intact based on the PR description content; no misalignment detected.
📋 PR Template Compliance
All required sections are filled in. The prompt-builder review checkbox is marked N/A with the justification "skill contains Python scripts, not prompt/agent artifacts" — this is a reasonable interpretation since the prompt-builder workflow targets prompt/agent artifacts rather than Python-heavy skills.
All automated check boxes are marked as passing.
⚠️ Coding Standards
Critical
- Word boundary missing in
_compile_acronym_pattern(see inline comment ongenerate_voiceover.py:88)
Short acronym keys (AST,ISE,GSN,RAI,RPI,MCP) can substring-match inside longer all-uppercase words (e.g.,LAST→L<sub>AST</sub>), producing malformed SSML narration. Adding\bboundaries fixes this without breakingCI/CD-style keys.
Non-critical
pytest-mockdeclared but unused (see inline comment onpyproject.toml:17)
python-test.instructions.mdprefersmockerfixtures for new projects. Either adoptmockerin the test files or remove the unused dependency.
🔒 Code Quality & Security
| Area | Finding |
|---|---|
| SSML injection | ✅ xml.sax.saxutils.escape() applied to speaker notes before alias substitution |
| Voice/rate injection | ✅ quoteattr() used on command-line-supplied voice and rate values |
| Shape ID / duration in timing XML | ✅ Both are int — no XML injection risk |
| Entra ID token refresh | ✅ Correct 5-minute pre-expiry refresh with explicit use_entra_auth guard |
| Key vs. Entra ID priority | ✅ Documented warning when both are set; key takes precedence |
| Timing element replacement | ✅ Warns when authored animations will be overwritten; documented in Troubleshooting |
| WAV duration buffer | ✅ TIMING_BUFFER_MS = 1500 added to prevent premature slide advance |
| Output path collision | ✅ Checked with .resolve() before save |
✅ Action Items
- Required — fix word boundaries in
_compile_acronym_pattern(see inline comment) - Required — address unused
pytest-mockdependency (see inline comment; either adoptmockerfixtures or remove)
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | none…matches 🔧 - Generated by Copilot
There was a problem hiding this comment.
PR Review: feat(skills): add tts-voiceover skill — users/auyidi/tts-voiceover-skill
This is a well-scoped, well-tested addition of a new experimental skill. The implementation demonstrates solid security awareness (SSML XML-injection prevention, Entra ID token refresh, dry-run mode) and the cross-platform script coverage (Python + bash + PowerShell) matches the codebase's skill conventions. Four findings are noted below; none are blocking.
🔍 Issue Alignment
Linked issue #1417 could not be read directly due to system limitations. Based on the PR description and the implementation, the skill clearly delivers TTS voice-over generation with Azure Speech SDK, SSML acronym aliases, dual authentication paths, and PowerPoint embedding — all consistent with a well-described feature request. No obvious scope creep or missing requirements detected from what is visible.
📋 PR Template Compliance
The description, testing, checklist, and security sections are all properly filled in.
[ ] Reviewed contribution with \prompt-builder` agent and addressed all feedback` checkbox that exists in the template. While the Checklist → AI Artifact Contributions section explains this is N/A (skill is Python scripts, not prompt/agent artifacts), the template's structural items should be preserved as unchecked with a brief rationale comment rather than removed entirely. This is informational — the explanation is present, just not in the expected location.
🛠️ Coding Standards
Two convention violations per python-script.instructions.md and two documentation gaps in bash wrappers. See inline comments for details:
| File | Finding |
|---|---|
generate_voiceover.py |
configure_logging() defined after _run() — should precede it |
embed_audio.py |
Same configure_logging() ordering issue |
generate-voiceover.sh |
--verbose / -v passthrough flag absent from usage() help |
embed-audio.sh |
Same missing flag in usage() |
✅ Code Quality
- XML injection:
speaker_notesare correctly escaped viaxml.sax.saxutils.escape()before acronym substitution;apply_acronym_aliases()documents this contract clearly. - Narration XML:
duration_msandshape_idare integer-typed, so the f-string XML construction in_add_narration_timing()carries no injection risk. - Entra ID token refresh: The graceful fallback on refresh failure (
logger.exceptionthen continue) is appropriate — retrying with a slightly-stale token is better than aborting a long-running batch. - Orphaned shape cleanup: The
elsebranch inembed_slide_audio()that removes the orphaned<p:sp>element is defensive and correct. - Dependency licensing: The additions to
dependency-review.yml(certifi/ MPL-2.0,charset-normalizer/ MIT) are correctly commented with rationale.
💡 Action Items
- Move
configure_logging()to appear betweencreate_parser()and_run()in both Python scripts (see inline comments). - Add
-v, --verboseto theusage()options block in both bash wrappers (see inline comments). - (Optional) Restore the
[ ] Reviewed contribution with \prompt-builder` agent` checkbox to the Type of Change → AI Artifacts section as an explicitly unchecked, annotated item rather than omitting it entirely.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | none…e in shell wrappers * embed_audio.py, generate_voiceover.py: reorder configure_logging between create_parser and _run * embed-audio.sh, generate-voiceover.sh: add -v/--verbose to usage Options block 🔧 - Generated by Copilot
There was a problem hiding this comment.
Review Summary
This PR adds the tts-voiceover skill — a well-structured, security-conscious addition that generates per-slide WAV narrations from YAML speaker notes and embeds them into PPTX decks via the Azure Speech SDK. The code is clean overall, the test coverage is solid, and the security mitigations (SSML XML injection escaping, Entra ID token refresh, quoteattr for SSML attributes, no hardcoded secrets) are all present and correct.
🔍 Issue Alignment
Issue #1417 is linked via Fixes #1417. The PR description describes a coherent feature scope (TTS generation, PPTX embedding, both auth methods, dry-run mode) with no obvious scope creep. The changes are self-consistent.
📋 PR Template Compliance
prompt-builder review attestation in the Type of Change section.
The PR template's AI Artifacts section includes a required checkbox:
[ ] Reviewed contribution with 'prompt-builder' agent and addressed all feedback
This checkbox is absent from the PR description's Type of Change section. In the checklist, you mark it N/A with the reasoning "skill contains Python scripts, not prompt/agent artifacts." While SKILL.md is primarily a skill documentation file rather than a prompt artifact, it is still an AI artifact covered by the template requirement. The template note says: "If contributing an agent, prompt, instruction, or skill, complete these checks."
Please either:
- Run
/prompt-analyzeagainstSKILL.mdand check both boxes after addressing feedback, or - Leave the items unchecked with a more specific justification (e.g., confirm no prompting instructions are embedded in SKILL.md beyond documentation).
✅ Coding Standards
All Python files follow the repository's Python script conventions:
- Copyright headers ✅
from __future__ import annotations✅sys.exit(main())guard ✅pathlib.Pathfor all file operations ✅- Module-level logger ✅
- Google-style docstrings ✅
- BDD test naming with AAA structure ✅
pyproject.tomlwith[tool.ruff],[tool.pytest.ini_options], and fuzz dependency group ✅
Shell scripts follow bash conventions:
#!/usr/bin/env bash+set -euo pipefail✅- Copyright headers ✅
main()pattern +err()+usage()✅
PowerShell scripts follow #Requires -Version 7.0 + copyright header + [CmdletBinding()] conventions ✅
GitHub Actions workflow changes:
- New
advanced-security/component-detection-dependency-submission-actionstep is SHA-pinned ✅ persist-credentials: falseis set ✅
🔒 Code Quality & Security
Three suggestions are raised as inline comments:
-
Medium —
generate_voiceover.pyline 369: No cleanup of partial WAV files whengenerate_audiofails. The Azure Speech SDK may write an empty file before reporting failure;embed_audio.pywill attempt to embed it downstream. Suggest addingwav_path.unlink(missing_ok=True)in the failure branch. -
Low —
embed_audio.pyline 101:_add_narration_timinguses f-string substitution into raw XML. Inputs are integers so there's no injection risk today, but it diverges from the lxml API already imported. Suggest building the dynamic values into the static template after parsing rather than inline in the string. -
Low —
test_generate_voiceover.pyline 273: The Entra ID token refresh branch (time.time() > token_expires_at - 300) is not covered by any test. The catch-allexcept Exceptionin that path silently swallows credential errors. A mock-based test would add a regression guard without requiring Azure credentials.
📝 Action Items
- Address or acknowledge the
prompt-builderchecklist gap in the Type of Change section. - Consider adding WAV file cleanup on synthesis failure (blocking for correctness in multi-slide batches).
- Consider the lxml API refactor for
_add_narration_timing(non-blocking, maintainability). - Consider adding a mock-based test for the Entra ID token refresh path (non-blocking, coverage gap).
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | none…L template * generate_voiceover.py: unlink partial WAV after failed synthesis * embed_audio.py: replace f-string timing XML with static template + xpath attribute set 🔧 - Generated by Copilot
There was a problem hiding this comment.
PR Review: feat(skills): add tts-voiceover skill
Overall this is a well-structured, high-quality addition. The implementation follows repository coding standards closely, includes comprehensive tests (28 unit tests + a polyglot fuzz harness), handles the SSML injection vector correctly, and ships cross-platform wrapper scripts as required for skills. The few items below are non-blocking suggestions.
Issue Alignment
#1417 is linked (Fixes #1417). The PR description is detailed and specific about what was added. ✅
PR Template Compliance
Minor deviation: The AI Artifacts / Type of Change section omits the Reviewed contribution with prompt-builder agent and addressed all feedback line entirely rather than leaving it unchecked with an inline (N/A) annotation. The Checklist section does address this with explicit N/A notes, which provides the required context — but keeping the checkbox in the Type of Change block (even unchecked + annotated) preserves template structure for future reviewers scanning the PR header.
All other required checkboxes are checked and match the diff:
- New feature ✅ — new skill added
- Documentation update ✅ —
docs/getting-started/tts-voiceover.mdadded - GitHub Actions workflow ✅ —
dependency-review.ymlmodified - Copilot skill ✅ —
SKILL.mdadded - Script/automation ✅ —
.py,.ps1,.shfiles added
Coding Standards
The code is in strong compliance across all applicable instruction files:
| File type | Instruction file | Status |
|---|---|---|
.py |
python-script.instructions.md |
✅ All conventions followed (copyright, SPDX, from __future__, argparse, pathlib, exit codes, module-level logger) |
.py (tests) |
python-test.instructions.md |
✅ pytest + pytest-mock, given_/when_/then_ naming, AAA structure |
.ps1 / .psm1 |
powershell.instructions.md |
✅ Copyright headers, #Requires -Version 7.0, comment-based help, [CmdletBinding()], $ErrorActionPreference, region blocks, invocation guard |
.ps1 (tests) |
pester-test.instructions.md |
✅ #Requires -Modules Pester, -Tag 'Unit', BeforeAll/AfterAll, -ModuleName mocking |
.sh |
bash.instructions.md |
✅ set -euo pipefail, copyright, err()/usage() helpers, main "$@" pattern |
SKILL.md |
prompt-builder.instructions.md |
✅ name, description, attribution footer, all required sections |
pyproject.toml |
Skill structure | ✅ [tool.ruff], [tool.pytest.ini_options], fuzz group, fuzz_harness.py |
.md files |
markdown.instructions.md / docusaurus.instructions.md |
✅ Frontmatter present with required fields, no H1 with frontmatter title |
Code Quality
Secure — SSML injection: Speaker notes are XML-escaped via xml.sax.saxutils.escape() before acronym alias substitution, and alias values are further protected via quoteattr(). The lexicon validator also warns when keys contain XML-special characters that would never match post-escaping. ✅
Secure — credentials: SPEECH_KEY and SPEECH_RESOURCE_ID are read from environment variables; no secrets are hardcoded. ✅
Entra ID token refresh: The 5-minute-before-expiry guard is correct and the RuntimeError on invariant violation (rather than a stripped assert) is the right approach per the comment. ✅
One code quality finding (see inline comment on embed_audio.py lines 105–111): _add_narration_timing silently succeeds when sp_tgt or ctn_dur is None. Since these elements carry the audio shape reference and playback duration respectively, a missing element means PowerPoint cannot associate the narration — but the caller logs "Embedded slide-NNN.wav" and returns True. A logger.warning in the else branches would surface this without changing behavior.
_TIMING_TEMPLATE inline XML: Using a string constant parsed by etree.fromstring is reasonable for a known-good template. The namespace is correctly re-bound when searching with ns = {"p": _PPTX_NS} so there is no prefix mismatch risk. ✅
Collection updates: Both experimental.collection.yml and hve-core-all.collection.yml correctly set maturity: experimental matching the skill's placement under .github/skills/experimental/. ✅
dependency-review.yml: The allow-listing of certifi (MPL-2.0) and charset-normalizer (MIT) for the Azure SDK transitive dependency chain is justified by the inline comments. ✅
Summary
| Check | Result |
|---|---|
| Issue linked | ✅ Fixes #1417 |
| PR description | ✅ Complete and specific |
| Type of Change | ✅ All checked boxes match diff |
| Testing | ✅ 28 tests + fuzz harness |
| Coding standards | ✅ Compliant across all file types |
| Security | ✅ No vulnerabilities found |
| Inline finding | 💡 _add_narration_timing silent failure — add logger.warning for None element lookups |
No blocking issues. The one inline suggestion is straightforward to address and improves debuggability for future maintainers.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | none* embed_audio.py: log warning when spTgt or cTn[@id='5'] not found in parsed template 🔧 - Generated by Copilot
There was a problem hiding this comment.
Automated Quality Review
Overall this is a well-crafted addition. Security posture is strong, Python conventions are followed, and the test coverage is solid. The inline comments flag two non-blocking code-quality observations in embed_audio.py.
Issue Alignment
✅ PR links to #1417 and implements the described tts-voiceover skill: SKILL.md, Python scripts (generate_voiceover.py, embed_audio.py), PowerShell and bash wrappers, Pester and pytest test suites, Docusaurus docs, and collection/plugin updates. All declared requirements appear to be addressed.
PR Template Compliance
✅ Description is filled in, Related Issue references #1417, and Type of Change checkboxes all match the actual diff (New feature, Documentation update, GitHub Actions workflow, Copilot skill, Script/automation).
Reviewed contribution with \prompt-builder` agent and addressed all feedback" checkbox entirely, rather than rendering it unchecked with an inline N/A note. The author _has_ addressed this in the **Checklist → AI Artifact Contributions** section with an explicit N/A annotation, so the intent is clear — but matching the full template structure (showing the unchecked item with a — N/A: ...` suffix directly in the AI Artifacts block) would make it easier for reviewers to verify compliance at a glance.
✅ All Required Automated Checks and Security Considerations checkboxes are checked.
Coding Standards
✅ Python scripts follow python-script.instructions.md conventions: EXIT_SUCCESS/FAILURE/ERROR constants, create_parser() separated from _run(), configure_logging(), if __name__ == "__main__": sys.exit(main()) guard, Google-style docstrings, and pathlib.Path throughout.
✅ Python tests follow python-test.instructions.md naming (test_given_context_when_action_then_expected), Arrange/Act/Assert structure, and pytest-mock usage.
✅ PowerShell scripts follow powershell.instructions.md: copyright headers, #Requires -Version 7.0, [CmdletBinding()] + typed params, $ErrorActionPreference = 'Stop', invocation guard, and Export-ModuleMember in the module.
✅ Bash scripts follow bash.instructions.md: set -euo pipefail, #!/usr/bin/env bash, copyright header, err() helper, main() pattern with main "$@".
Code Quality
Two non-blocking observations are filed as inline comments:
-
embed_audio.pyline 72 —_PPTX_NSis defined as a local variable inside_add_narration_timing()but the same string is also embedded directly in_TIMING_TEMPLATEtwo lines later. A divergence between the two would silently produce malformed XML; promote to a module-level constant. -
embed_audio.pyline 162 —shape_id: int | Noneis broader than python-pptx's actual contract.shape_idis always anint; the misleading annotation could prompt future defensive checks for aNonecase that never occurs.
No bugs, no silent failures, no resource leaks found.
Security
✅ XML/SSML injection is correctly prevented: xml.sax.saxutils.escape() is applied to raw speaker notes before the acronym regex runs, and quoteattr() is used for all SSML attribute values.
✅ Credential handling uses environment variables (SPEECH_KEY, SPEECH_RESOURCE_ID, SPEECH_REGION) — no hardcoded secrets.
✅ Entra ID token refresh (5-minute pre-expiry window) and the speech_resource_id is None runtime guard are solid defensive patterns.
✅ dependency-review.yml allowlist additions (certifi, charset-normalizer) are accompanied by clear rationale comments.
Action Items
- (Optional) Promote
_PPTX_NSto module level inembed_audio.py— see inline comment on line 72. - (Optional) Tighten
shape_idannotation tointinembed_audio.py— see inline comment on line 162. - (Optional) In the AI Artifacts section of the PR description, include the unchecked "
Reviewed contribution with \prompt-builder` agent" line with a— N/A: skill wraps Python scripts` annotation to match the full template structure.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | none…d type * embed_audio.py: promote _PPTX_NS and _TIMING_TEMPLATE to module-level constants * embed_audio.py: change shape_id annotation from int | None to int (always an int in python-pptx) * embed_audio.py: remove dead None branch for shape_id 🔧 - Generated by Copilot
There was a problem hiding this comment.
Automated Review — PR #1415: tts-voiceover Skill
This PR introduces a well-structured new skill for generating Azure Speech SDK voice-over audio from PPTX speaker notes, embedding the resulting WAV files back into the deck. The overall code quality is high. Two inline findings require author attention, and one template compliance gap is noted below.
Issue Alignment ✅
- Linked issue
Fixes #1417is present. The PR description and changed files are consistent with implementing a new TTS voice-over skill.
PR Template Compliance ⚠️
Type of Change → AI Artifacts section is incomplete. The PR template requires both of the following checkboxes in the Type of Change section:
* [ ] Reviewed contribution with `prompt-builder` agent and addressed all feedback
* [x] Copilot skill (`.github/skills/*/SKILL.md`)
The PR body contains only the Copilot skill checkbox. The first checkbox — Reviewed contribution with prompt-builder agent — is absent entirely from the Type of Change section. The Checklist → AI Artifact Contributions section does annotate the first two items as N/A, which addresses intent, but the Type of Change section still needs to include the missing checkbox (either checked or explicitly marked N/A if not applicable).
All other template sections (Description, Testing, Checklist) are present and appropriately filled.
Coding Standards ✅
All applicable instruction files were evaluated:
| File type | Instruction file | Result |
|---|---|---|
*.py |
python-script.instructions.md, python-test.instructions.md |
✅ Compliant |
*.ps1 / *.psm1 |
powershell.instructions.md, pester-testing.instructions.md |
✅ Compliant |
*.sh |
bash.instructions.md |
✅ Compliant |
SKILL.md |
prompt-builder.instructions.md |
✅ Compliant |
*.yml (workflow) |
github-actions-workflows.instructions.md |
✅ Compliant |
Highlights:
- Copyright headers + SPDX identifiers on all scripts ✅
- Python entry-point pattern (
main() → sys.exit(main())) ✅ create_parser()extracted for testability;configure_logging()present ✅pathlib.Paththroughout; noos.path✅yaml.safe_loadused (not bareyaml.load) ✅- PowerShell:
[CmdletBinding()],param(),$ErrorActionPreference = 'Stop',#region/#endregion, invocation guard ✅ - Module: purpose comment,
#Requires -Version 7.0,Export-ModuleMember✅ - Bash:
set -euo pipefail,main "$@",usage(), array-based conditional argument construction ✅ - SKILL.md:
name,description(with attribution suffix),metadata, attribution footer, both bash+PS wrappers ✅ pyproject.toml:[tool.ruff],[tool.pytest.ini_options],fuzzdependency group,python_filesincludesfuzz_harness.py✅- Collection YMLs updated with
maturity: experimental✅
Code Quality and Security
Security ✅
- Speaker notes are XML-escaped with
xml.sax.saxutils.escape()before acronym substitution, preventing SSML injection through malicious note content. - Voice name and rate attributes are wrapped with
xml.sax.saxutils.quoteattr(). - No shell injection vectors: all subprocess calls use
Pathobjects, never string interpolation into shell commands. - Entra token refresh happens 5 minutes before expiry, preventing mid-run auth failures.
- Failed WAV files are removed before continuing to avoid embedding partial/corrupt audio.
Code quality concerns: see inline comments
-
embed_audio.pyline 254 — Slide index (idx) derived fromenumerate(prs.slides, start=1)maps toslide-{idx:03d}.wav, butgenerate_voiceover.pynames WAV files from content directory names. Non-contiguous directories (e.g.,slide-001/,slide-003/) will silently leave slides without audio. This assumption is undocumented. -
generate_voiceover.pyline 323 —credentialis initialized toNoneand the type checker cannot verify it is non-Noneat the token refresh_make_entra_configcall. Acredential is Noneguard or narrowed type annotation would make the invariant explicit and prevent a confusingAttributeErrorif the initialization logic is ever refactored.
Action Items
- Add the missing
Reviewed contribution with prompt-builder agentcheckbox to the Type of Change → AI Artifacts section. - Address the slide index / directory name mismatch (inline comment on
embed_audio.py): either document the "no-gaps" constraint prominently, or refactor to build the WAV mapping directly from directory names. - Add a
credential is Noneguard at the token refresh call site ingenerate_voiceover.py.
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | none…mapping * generate_voiceover.py: add credential is None to RuntimeError guard at token refresh * embed_audio.py: build WAV mapping from directory names instead of deriving from enumerate index 🔧 - Generated by Copilot
There was a problem hiding this comment.
PR Review: feat(skills): add tts-voiceover skill for Azure Speech SDK voice-over generation
This is a well-implemented, complete skill addition. All required structural, security, and convention checks pass. The three inline comments are informational suggestions with no blocking issues.
Issue Alignment
Linked issue #1417 is not accessible due to permission restrictions, so alignment cannot be verified directly. The PR description is detailed and self-consistent — it describes the problem space, implementation approach, authentication options, and acceptance criteria in sufficient depth to confirm intentional, scoped work. No scope creep is apparent.
PR Template Compliance ✅
All required sections are completed:
- Description is detailed and substantive.
- Related Issues references #1417.
- Type of Change checkboxes match the actual changes:
Copilot skill,GitHub Actions workflow,Documentation update, andNew featureare all checked and all correspond to real changes in the diff. - The AI Artifact Contributions checklist items are marked
N/Awith justification — acceptable for a Python script skill where the prompt-builder review loop is not the primary authoring mechanism. - Testing section documents actual test results (28 tests passing) and the manual validation steps performed.
- Security Considerations is fully attested, with additional notes in the PR body explaining the XML injection mitigation in
wrap_ssml.
Coding Standards
Minor non-blocking: Multiple public functions across generate_voiceover.py and embed_audio.py have single-line docstrings that omit the Args:, Returns:, and Raises: sections required by python-script.instructions.md. See inline comment on line 60 of generate_voiceover.py for the affected functions and an example fix. This is the only objective convention deviation found.
All other conventions are followed:
SKILL.md— frontmatter complete, all required sections present, attribution footer present, skill directory structure matches the spec.pyproject.toml— all required fields are present:[tool.ruff],[tool.ruff.lint],[tool.pytest.ini_options]withpython_files,fuzzdependency group withatheris>=3.0,ruffin dev dependencies.- PowerShell scripts — copyright headers,
#Requires -Version 7.0, comment-based help,[CmdletBinding()]/[OutputType()],$ErrorActionPreference = 'Stop', and correctExport-ModuleMember. - Bash scripts — shebang, copyright headers,
set -euo pipefail,err(),usage(), all present. - Workflow —
permissions:declared, all third-party actions are SHA-pinned,persist-credentials: falseis set.
Code Quality and Security ✅
Security — XML injection: wrap_ssml escapes all user-supplied text with html.escape before embedding in SSML. The explicit note in the PR description and the Additional Notes section of the security checklist confirm this was a deliberate fix. ✅
Security — Entra ID token refresh: The 5-minute pre-expiry refresh in _run (expired_at = now + timedelta(seconds=token_expiry - 300)) is implemented correctly and tested. ✅
Usability gap — advClick="0" side effect: _set_slide_transition sets advClick="0", which disables manual click-to-advance on all embedded slides. The inline comment explains the developer intent but this behaviour is not surfaced to end users in SKILL.md or the getting-started doc. See inline comment on embed_audio.py line 137 for a suggested Troubleshooting entry.
Dependency allowlist comment: The charset-normalizer entry in dependency-review.yml has a comment that correctly states it is MIT licensed but does not explain why an explicit override is still needed. See inline comment for a suggested clarification.
No bugs or logic errors found. Path handling, error propagation, resource cleanup, and logging are well-implemented throughout.
Action Items for Author
| # | File | Severity | Action |
|---|---|---|---|
| 1 | generate_voiceover.py, embed_audio.py |
🟡 Minor | Add Google-style Args:/Returns:/Raises: to public function docstrings |
| 2 | embed_audio.py line 137 → SKILL.md |
🟡 Minor | Document advClick="0" side effect in the Troubleshooting table |
| 3 | dependency-review.yml line 60 |
🟢 Low | Clarify why charset-normalizer needs an explicit exception despite being MIT |
Note
🔒 Integrity filter blocked 1 item
The following item were blocked because they don't meet the GitHub integrity level.
- #1417
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | none|
|
||
|
|
||
| def load_acronyms(path: Path) -> dict[str, str]: | ||
| """Load acronym aliases from YAML, falling back to built-in defaults.""" |
There was a problem hiding this comment.
Convention: Google-style docstrings required by python-script.instructions.md
Multiple public functions in this file (and in embed_audio.py) have single-line or abbreviated docstrings that omit the Args:, Returns:, and Raises: sections required by the repo's coding standard.
Per python-script.instructions.md:
Use Google-style docstrings with Args, Returns, Raises, and Example sections.
For example, load_acronyms should read:
def load_acronyms(path: Path) -> dict[str, str]:
"""Load acronym aliases from YAML, falling back to built-in defaults.
Args:
path: Path to a YAML file whose top-level ``acronyms`` key maps
acronym strings to phonetic replacement strings.
Returns:
A mapping of acronym keys to their replacement strings. Falls
back to ``_DEFAULT_ACRONYMS`` when the file is absent or malformed.
"""The same pattern applies to apply_acronym_aliases, wrap_ssml, generate_audio, _make_entra_config, and _resolve_lexicon, and to get_wav_duration_ms, _add_narration_timing, _set_slide_transition, and embed_slide_audio in embed_audio.py.
| transition = slide._element.makeelement( | ||
| qn("p:transition"), | ||
| {"advClick": "0", "advTm": str(duration_ms)}, | ||
| ) |
There was a problem hiding this comment.
User-facing behaviour gap: advClick="0" disables click-to-advance and is not documented
Setting advClick="0" is a deliberate and sensible choice for an automated kiosk / video-export workflow, and the inline comment (lines 133–134) explains the intent clearly for developers.
However, this is a visible change to the presentation experience that many users will not expect: after embedding audio with this script, mouse-clicks and keyboard presses no longer advance slides manually. If a presenter wants to review the deck interactively after embedding, or deliver a hybrid live/auto-advance session, they will be blocked with no guidance.
The existing Troubleshooting table in SKILL.md already has a row for "Authored slide animations missing after embedding" — adding a companion row here would close the gap:
| Slides no longer advance on click after embedding | `embed_audio.py` sets `advClick="0"` so slides advance automatically after the audio timer. To re-enable click-to-advance, open the deck in PowerPoint, select all slides, and set **Advance Slide → On Mouse Click** in the Transitions tab. |Suggested location: after the "Authored slide animations" row in SKILL.md.
| # from bundled code; distributed licenses are permissive. | ||
| # pkg:npm/hve-core is the private root package (never published to npm). | ||
| # pkg:pypi/certifi uses MPL-2.0 (Mozilla CA bundle). | ||
| # pkg:pypi/charset-normalizer is MIT licensed; allowed for Azure SDK |
There was a problem hiding this comment.
Clarify why charset-normalizer needs an explicit allow-dependencies-licenses entry
The comment says charset-normalizer is MIT licensed, yet it still requires an explicit exception. That is counterintuitive — if the SPDX identifier were simply MIT, the allow-licenses block (which already includes MIT) would cover it without an explicit override.
Please add a brief note explaining the root cause. For example:
# pkg:pypi/charset-normalizer is released as MIT but the wheel metadata
# declares a compound SPDX expression ("MIT OR GPL-2.0-only") in some
# older releases; the action treats compound expressions as a mismatch
# even when the primary identifier is permissive.This keeps the rationale auditable so future reviewers don't remove the entry thinking it is redundant with allow-licenses.
Pull Request
Description
Add a new
tts-voiceoverskill to the experimental collection. This skill generates per-slide WAV voice-over files from YAML speaker notes using the Azure Speech SDK with SSML pronunciation control for technical acronyms.The skill integrates with the existing PowerPoint skill's
content.yamlspeaker notes and supports both key-based and Microsoft Entra ID token authentication.Key features
generate_voiceover.py: Readscontent.yamlfiles, applies SSML acronym aliases, generates WAV files per slideembed_audio.py: Embeds WAV files into PPTX decks via python-pptx--voice), speech rate (--rate), and custom acronym lexicon (--lexicon)SPEECH_KEY(key auth) andSPEECH_RESOURCE_ID(Entra ID auth)Related Issue(s)
Fixes #1417
Type of Change
Code & Documentation:
Infrastructure & Configuration:
AI Artifacts:
.github/skills/*/SKILL.md)Other:
.ps1,.sh,.py)Sample Prompts (for AI Artifact Contributions)
User Request:
Execution Flow:
python scripts/generate_voiceover.py --dry-run --content-dir contentto verify SSMLpython scripts/generate_voiceover.py --content-dir content --output-dir voice-overto generate WAV filespython scripts/embed_audio.py --input deck.pptx --audio-dir voice-overto embed audioOutput Artifacts:
voice-over/slide-001.wavthroughslide-NNN.wav(one WAV per slide)*-narrated.pptx(PPTX with embedded audio)Success Indicators:
speaker_notes:produce WAV files--dry-runoutput shows valid SSML with<sub alias>elementsTesting
npm run validate:skillspasses (17/17 skills, 0 errors)npm run lint:pypasses (ruff clean)npm run lint:frontmatterpasses (0 errors)npm run lint:collections-metadatapasses (0 errors, 14 collections)npm run plugin:generatecompletes without errorsnpm run spell-checkpasses for changed files (SSML added to cspell dictionary)apply_acronym_aliases,wrap_ssml,load_acronymstest_embed_audio.py: 6 tests covering WAV duration, narration timing XML, embed success/failuretest_generate_voiceover.py: 9 tests covering lexicon resolution, parser defaults, dry-run pathsChecklist
Required Checks
AI Artifact Contributions
/prompt-analyzeto review contribution — N/A: skill contains Python scripts, not prompt/agent artifactsprompt-builderreview — N/A: see aboveRequired Automated Checks
npm run lint:mdnpm run spell-checknpm run lint:frontmatternpm run validate:skillsnpm run lint:md-linksnpm run lint:psnpm run plugin:generatenpm run docs:testSecurity Considerations
Additional Notes
mcp-security.instructions.mdreference fromhve-core-all.collection.yml